Tuesday, August 27, 2013

HBase Setup on Ubuntu


Apache HBase is the Hadoop database: a distributed, scalable big data store. The following steps walk a new user through configuring and using HBase.

1) First, download the latest version of HBase from the official site: http://hbase.apache.org/. A binary release is preferred.

2) Unpack the compressed tar file and configure it. The most important configuration steps are:
i) Change conf/hbase-site.xml to look like this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///DIRECTORY/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/DIRECTORY/zookeeper</value>
  </property>
</configuration>

Here DIRECTORY is the local directory where you want to store the data; a sample path could be /home/yourname/database/hbase. You can also set the directory for ZooKeeper, but that is not mandatory.
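The substitution of DIRECTORY can be scripted. The following is only a minimal Python sketch, not part of HBase: the function name render_hbase_site and the choice of one base directory with hbase and zookeeper subdirectories are assumptions made for illustration.

```python
from pathlib import PurePosixPath
from xml.sax.saxutils import escape

# Same layout as the hbase-site.xml shown above, with two placeholders.
TEMPLATE = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>{rootdir}</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>{zookeeper_dir}</value>
  </property>
</configuration>
"""


def render_hbase_site(base_dir: str) -> str:
    """Return hbase-site.xml text for a local (standalone) setup.

    base_dir is a hypothetical parameter: an absolute local directory
    under which the 'hbase' and 'zookeeper' data directories will live.
    """
    base = PurePosixPath(base_dir)
    if not base.is_absolute():
        raise ValueError("base_dir must be an absolute path")
    # file:// plus an absolute path gives the file:///... form used above.
    rootdir = "file://" + str(base / "hbase")
    zookeeper_dir = str(base / "zookeeper")
    return TEMPLATE.format(
        rootdir=escape(rootdir),
        zookeeper_dir=escape(zookeeper_dir),
    )
```

Calling render_hbase_site("/home/yourname/database") returns the XML as a string, which you could then write to conf/hbase-site.xml yourself.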

ii) Ubuntu users, and users of some other Linux distributions, should also adjust the hosts file. As the tutorial on the official site says: "HBase expects the loopback IP address to be 127.0.0.1. Ubuntu and some other distributions, for example, will default to 127.0.1.1 and this will cause problems for you". A modified hosts file might look like this:

127.0.0.1 localhost
127.0.0.1 yourname
127.0.1.1 yourname
# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet

"yourname" stands for the hostname of your Ubuntu machine. If you do not modify the hosts file in /etc, you may run into a problem when you start the HBase shell: HBase reports the error "ERROR: org.apache.hadoop.hbase.PleaseHoldException: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing". This happens when the loopback IP address is not set up correctly, which can leave the master stuck in its initialization process.

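To check a hosts file before starting HBase, something like the following Python sketch could help. It is an illustration only: the function name first_address_for is made up here, and it assumes the resolver uses the first entry in the file that lists the name, which is why the 127.0.0.1 line sits above the 127.0.1.1 line in the sample.

```python
from typing import Optional

# The first three lines of the modified hosts file shown above.
SAMPLE_HOSTS = """127.0.0.1 localhost
127.0.0.1 yourname
127.0.1.1 yourname
"""


def first_address_for(hosts_text: str, name: str) -> Optional[str]:
    """Return the address of the first hosts entry that lists `name`.

    Returns None when no entry mentions the name.
    """
    for raw_line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if name in names:
            return address
    return None


def loopback_ok(hosts_text: str, name: str) -> bool:
    """True when the first entry for `name` is the 127.0.0.1 loopback."""
    return first_address_for(hosts_text, name) == "127.0.0.1"
```

For a real check you would pass in the contents of /etc/hosts and your own hostname instead of SAMPLE_HOSTS and "yourname".
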
3) You can then launch HBase with ./bin/start-hbase.sh. Once the master is running, you can confirm it by checking the running processes on your Ubuntu machine. To stop HBase, use ./bin/stop-hbase.sh.

4) You can also use the HBase shell to create new tables or operate on existing tables: ./bin/hbase shell. The HBase master must be running first (see step 3); otherwise the HBase shell will not work.
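As a rough illustration of what a first shell session might contain, the Python sketch below builds a short list of HBase shell commands (create, put, scan) and can pipe them to ./bin/hbase shell. The helper names, the table name test, and the column family cf are placeholders chosen for this example, and feeding the commands through standard input is one assumed way of running them non-interactively.

```python
import subprocess
from pathlib import Path


def build_shell_script(table: str, family: str, row: str,
                       qualifier: str, value: str) -> str:
    """Build HBase shell commands: create a table, put one cell, scan it.

    No quoting or escaping is done, so the arguments must not contain
    single quotes.
    """
    commands = [
        f"create '{table}', '{family}'",
        f"put '{table}', '{row}', '{family}:{qualifier}', '{value}'",
        f"scan '{table}'",
        "exit",
    ]
    return "\n".join(commands) + "\n"


def run_hbase_shell(script: str, hbase_home: str) -> str:
    """Pipe `script` to <hbase_home>/bin/hbase shell and return its output.

    hbase_home is a hypothetical parameter: the directory where the HBase
    tar file was unpacked. HBase must already be running (step 3).
    """
    hbase_bin = Path(hbase_home) / "bin" / "hbase"
    result = subprocess.run(
        [str(hbase_bin), "shell"],
        input=script,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, run_hbase_shell(build_shell_script("test", "cf", "row1", "a", "value1"), "/path/to/hbase") would create the table, insert one value, and return the scan output, provided the master has finished initializing.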



Friday, August 23, 2013

Just Do It

Sometimes you just need to take the first step; after that, everything becomes much easier.

Thursday, August 22, 2013

About Me

About me:


I have been a postdoc at the University of California, San Diego, since Aug. 2013. Before joining the Opera group, I graduated from the Institute of Computing Technology, Chinese Academy of Sciences, with a PhD in computer science. Before that, I studied at the University of Science and Technology of China, majoring in EE, from 2003 to 2007.

My interests span many fields, including advanced compilers, program analysis, data mining, machine learning, and web development. I use C, C++, Java, Python, and Perl in my daily development. I am also interested in big data analysis, for instance, effectively and efficiently collecting and analyzing information about people's connections from the internet.