- Threads sorted by date: hadoop-hbase-user 201007
Hello all,
I am running a job from jython that is importing time series data into HBase. I started to see the following messages and wanted to dive deeper to find out if they are true errors or just debug messages:
10/07/23 09:51:07 DEBUG client.HConnectionManager$TableServers: Reloading region subset,a40506-2016/07/23-20:33:30.296,1279902520534 location because regionserver didn't accept updates; tries=0 of max=10, waiting=1000ms
10/07/23 09:51:08 DEBUG client.HConnectionManager$TableServers: Cached location for .META.,,1 is 10.10.11.3:60020
10/07/23 09:51:08 DEBUG client.HConnectionManager$TableServers: locateRegionInMeta attempt 0 of 10 failed; retrying after sleep of 1000 because: No server address listed in .META. for region subset,a40506-2016/07/24-07:00:35.528,1279903897169
10/07/23 09:51:09 DEBUG client.HConnectionManager$TableServers: Cached location for subset,a40506-2016/07/24-07:00:35.528,1279903897169 is 10.10.11.2:60020
I did some searches on Google and the results seem to point at a potential lack of memory. Currently, HBase is set up with a heap of 2G for each slave, and there are 6 slaves. Each slave has a total of 8G of RAM installed. If you guys have any guidance on what other settings I should look at, please let me know.
Thanks!
The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain confidential or privileged information. Any unauthorized review, dissemination, distribution, or copying of this communication is prohibited. If you are not the intended recipient, please notify the sender immediately by reply e-mail, and destroy all copies of this message and any attachments from your files.
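One quick way to answer the "true errors or just debug messages" question is to tally the client log by level. The sketch below does that in plain Python. The line layout is only inferred from the four log lines quoted above, and the names `LOG_LINE`, `count_levels` and `SAMPLE` are made up for illustration.

```python
import re

# Layout assumed from the lines quoted in this thread:
# "yy/MM/dd HH:mm:ss LEVEL logger: message"
LOG_LINE = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} (\w+) ([^:]+): (.*)$")


def count_levels(lines):
    """Return a dict mapping each log level to the number of lines at that level."""
    counts = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # e.g. a stack-trace or continuation line
        level = match.group(1)
        counts[level] = counts.get(level, 0) + 1
    return counts


# Two of the lines from the original message, used as sample input.
SAMPLE = [
    "10/07/23 09:51:08 DEBUG client.HConnectionManager$TableServers: "
    "Cached location for .META.,,1 is 10.10.11.3:60020",
    "10/07/23 09:51:09 DEBUG client.HConnectionManager$TableServers: "
    "Cached location for subset,a40506-2016/07/24-07:00:35.528,1279903897169 "
    "is 10.10.11.2:60020",
]

levels = count_levels(SAMPLE)
```

If the tally over a whole job log contains only DEBUG and INFO entries, the messages are probably just client chatter; any WARN or ERROR entries are the ones worth reading in full.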
This is just our noisy client talking about the caching of region
locations out on the cluster (you are at DEBUG level). Turn off DEBUG
in the client if you'd rather not see the messages. Did the jython page up on the wiki help?
Yours,
St.Ack
On Fri, Jul 23, 2010 at 9:58 AM, Andrew Nguyen
St.Ack,
Thanks for the clarification - I just wanted to get confirmation that debug messages were just debug and not potentially indicative of something starting to go wrong. I happened to be looking at the DFS usage % and it keeps going up and down (I figured it should only be increasing) so that got me looking at the job's log...
The jython page on the wiki was extremely useful. I had actually never used jython before, but I am a big fan of python for getting stuff up quickly, so it seemed to be a natural progression. Having said that, I am looking at importing a ton of rows (not sure how many, but hundreds of millions to billions). Are there any good examples of doing this as efficiently as possible? And how does jython compare to a pure Java approach?
Currently, I have a for loop just calling table.put(p) repeatedly. I also have WAL disabled, autoflush set to false, and increased the buffer. Anything else I should consider?
Thanks!
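For readers following along, here is a minimal sketch of what a client-side write buffer does when autoflush is off: puts queue up locally and go out in batches once a size threshold is crossed, with one explicit flush at the end. `BufferedTable`, `send_batch` and `import_rows` are hypothetical stand-ins written in plain Python so the idea runs without a cluster; they are not the HBase API. In the real client the corresponding knobs are, as far as I know, `setAutoFlush(false)`, `setWriteBufferSize(...)` and `flushCommits()` on `HTable`.

```python
class BufferedTable(object):
    """Hypothetical stand-in for a table handle with autoflush turned off."""

    def __init__(self, send_batch, buffer_bytes):
        self.send_batch = send_batch      # callable that ships a list of puts
        self.buffer_bytes = buffer_bytes  # flush threshold, in bytes
        self.pending = []
        self.pending_bytes = 0

    def put(self, row, value):
        # Queue the put locally instead of paying one round trip per row.
        self.pending.append((row, value))
        self.pending_bytes += len(row) + len(value)
        if self.pending_bytes >= self.buffer_bytes:
            self.flush()

    def flush(self):
        # Ship everything queued so far as a single batch.
        if self.pending:
            self.send_batch(self.pending)
            self.pending = []
            self.pending_bytes = 0


def import_rows(rows, table):
    for row, value in rows:
        table.put(row, value)
    # Without a final flush, whatever is still buffered never gets sent.
    table.flush()


batches = []
table = BufferedTable(batches.append, 30)
# Each put below is 6 + 4 = 10 bytes, so the buffer fills every third row.
import_rows([("row%03d" % i, "vvvv") for i in range(10)], table)
batch_sizes = [len(batch) for batch in batches]  # [3, 3, 3, 1]
```

The thing to check in a real import loop is the last step: with autoflush off, the tail of the data sits in the client buffer until something flushes it. Separately, with the WAL disabled, edits that have not yet been flushed to disk on a region server are lost if that server dies mid-import.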
On Fri, Jul 23, 2010 at 10:18 AM, Andrew Nguyen
There is an old blog of Ryan's from back when he was doing all he
could to not sully his paws with dirty java:
http://ryantwopointoh.blogspot.com/2009/01/performance-of-hbase-importing.html
It's an old post. Jython may have come on since then.
You are on the right track. You might want to move to java but do the
timing first.
There is also http://hbase.apache.org/docs/r0.20.5/api/org/apache/hadoop/hbase/mapreduce/package-summary.html#bulk
which has been buggy until recently, though it should be working now. It's
good if you are doing single-column-family-only imports. Usually you
can see an order-of-magnitude improvement in speed when bulk inserting. This
bulk load facility got redone completely in TRUNK, and for sure it
works now. It's super fancy; you can even bulk load into a running
table; read more here:
http://hbase.apache.org/docs/r0.89.20100621/bulk-loads.html
St.Ack
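On "do the timing first": a rough way to compare approaches is to push the same sample of rows through each writer and look at rows per second. The helper below is only a sketch; `time_import` and `write_row` are made-up names, and `write_row` stands for whatever function does the actual put in a given setup.

```python
import time


def time_import(write_row, rows):
    """Write every row via write_row; return (row_count, seconds, rows_per_second)."""
    start = time.time()
    count = 0
    for row in rows:
        write_row(row)
        count += 1
    seconds = time.time() - start
    # Guard against a zero reading on very short runs.
    rate = count / seconds if seconds > 0 else float("inf")
    return count, seconds, rate


# Stand-in writer that just collects rows in memory.
sink = []
count, seconds, rate = time_import(sink.append, range(1000))
```

Timing a few hundred thousand rows this way, once from jython and once from Java, should give a fair idea of whether a port is worth the effort before committing to it.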
Thanks for the info. I actually used that blog post as a starting point for my work with jython.
I will also take a look at the bulk loading you referenced below. We are currently only doing single-cf imports.