From Brown University Robotics

Revision as of 23:36, 5 May 2010 by Bkorel (Talk | contribs)


The HBase Server

To Start Hbase on maria

If you need to restart Hbase on maria, follow these instructions. Hbase is currently installed in bkorel's home directory. The start scripts have global read and execute permissions, so anyone should be able to restart Hbase.

 cd /home/bkorel/HBase/src/hbase-0.20.3

Start the Hbase server with the below command. You will be prompted twice for your password to localhost.


Start the thrift server:

 bin/ start thrift --port=9091

If you do not specify the port on the command line, the thrift server uses the default port of 9090. However, if you get the error "Could not create ServerSocket on address /" when starting the thrift server, that port is unavailable and you need to specify a different port number, as in the above command. Note: port 9091 is currently hard-coded in the Hbase ros node (bin/record), so if you start the thrift server on any other port, change the port in that script to match.
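Before starting the thrift server, it can help to check which candidate port is actually free. The following is a minimal sketch using only the Python standard library; the helper names port_is_free and first_free_port are hypothetical and are not part of HBase or Thrift. It binds on 127.0.0.1 only, so it will not detect a process bound solely to another network interface.

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except socket.error:
        # Typically "address already in use": something else holds the port.
        return False
    finally:
        sock.close()

def first_free_port(candidates=(9090, 9091, 9092)):
    """Return the first free port among the candidates, or None if all are taken."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None
```

Calling first_free_port() returns a port number that could be passed to --port, or None if every candidate is taken. Remember that the Hbase ros node expects 9091 unless bin/record is edited.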

To start the Hbase shell:

 bin/hbase shell

Example Hbase shell usage:

 hbase> # Type "help" to see shell help screen
 hbase> help
 hbase> # To create a table named "mylittletable" with a column family of "mylittlecolumnfamily", type
 hbase> create "mylittletable", "mylittlecolumnfamily"
 hbase> # To see the schema for the "mylittletable" table you just created and its single "mylittlecolumnfamily", type
 hbase> describe "mylittletable"
 hbase> # To add a row whose id is "myrow", to the column "mylittlecolumnfamily:x" with a value of 'v', do
 hbase> put "mylittletable", "myrow", "mylittlecolumnfamily:x", "v"
 hbase> # To get the cell just added, do
 hbase> get "mylittletable", "myrow"

In case the logging table is deleted or inaccessible, run the following command to create the appropriate table and column families for storing data using the Hbase ros node:

 hbase> create "session_table", "timestamp", "msg", "topic"

To Stop Hbase

Hbase needs to be properly shut down; currently there are problems accessing previously stored data otherwise. Run the following two commands to stop the thrift and Hbase servers:

 bin/ stop thrift

Instructions to Install Hbase on maria

Note: root permissions are required (for installing Thrift).

Hbase is currently run in standalone, non-distributed mode.

Install ZooKeeper

ZooKeeper is a high-performance coordination service for distributed applications. HBase depends on ZooKeeper because HBase keeps the location of its root table, who the current master is, and what regions are currently participating in the cluster in ZooKeeper. By default HBase manages a single ZooKeeper instance for you. In standalone and pseudo-distributed modes this is usually enough, but for fully-distributed mode you should configure a ZooKeeper quorum.

Download and unpack a stable ZooKeeper release. Currently 3.2.2 is installed.

If you run into problems, follow ZooKeeper Getting Started.

Install Hadoop

Hadoop is a distributed computing platform. If running in standalone mode, you will not need to install/configure or run Hadoop because HBase by default uses the local filesystem. Distributed modes require an instance of the Hadoop Distributed File System (DFS).

Download and unpack a stable Hadoop release.

Follow the Hadoop Quick Start instructions.

See the Hadoop requirements and instructions for how to set up a DFS. Before starting HBase you need to start the Hadoop DFS daemons and stop the daemons after HBase has shut down.

Install Hbase

HBase provides Bigtable-like structured storage. The following are instructions for standalone mode. For pseudo- or fully-distributed modes, follow the HBase Getting Started instructions.

Download and unpack a stable HBase release. Currently 0.20.3 is installed.

cd to the HBase root directory and open conf/ Edit this file to set JAVA_HOME to point at the root of the Java installation:

 export JAVA_HOME=/usr

Test HBase by starting the server and shell. Be sure to stop HBase when done.

 bin/hbase shell

Install Thrift

Thrift is a software framework for cross-language services development. Since the HBase API is in Java, and we would like to write ros nodes in Python, thrift is necessary to interface Python clients to HBase.

Download and unpack a stable Thrift release. Currently 0.2.0 is installed.

From the root thrift directory, configure thrift:


Compile thrift:


If you are installing on maria again, you should not receive any compile errors. However, if you are installing on a new machine and compiling thrift gives the error "extconf.rb:20:in `require': no such file to load -- mkmf (LoadError)", the mkmf.rb library file is missing. It can be obtained from the ruby1.8-dev package by running the command:

 sudo apt-get install ruby1.8-dev


 sudo make install

After installing thrift, there should be a system-wide "thrift" command available, which should provide some usage information.

Export the python path so it can find the directory where the thrift packages are installed. Add the following line to your .bashrc file:

 export PYTHONPATH="/usr/lib/python2.6/site-packages"

If problems occur during installation, follow the Thrift Installation instructions.

Test Thrift

Let's write a python client to test thrift.

The HBase Thrift API is described in the following file: [hbase-root]/src/java/org/apache/hadoop/hbase/thrift/Hbase.thrift

Generate a thrift client package:

 cd [hbase-root]/src/java/org/apache/hadoop/hbase/thrift
 thrift --gen py Hbase.thrift

This produces a directory called gen-py containing a set of generated Python classes for communicating with the HBase thrift server. Copy the hbase directory contained in the gen-py directory to a project directory of your choosing.

Create a file in the same directory:

 #!/usr/bin/env python
 import sys
 from thrift import Thrift
 from thrift.transport import TSocket
 from thrift.transport import TTransport
 from thrift.protocol import TBinaryProtocol
 from hbase import Hbase
 from hbase.ttypes import *
 host = 'localhost'
 port = 9091 # default thrift port is 9090
 # Make socket
 transport = TSocket.TSocket(host, port)
 # Buffering is critical. Raw sockets are very slow
 transport = TTransport.TBufferedTransport(transport)
 # Wrap in a protocol
 protocol = TBinaryProtocol.TBinaryProtocol(transport)
 client = Hbase.Client(protocol)
 # The transport must be opened before any call is made
 transport.open()
 try:
     # List the names of the tables currently defined in HBase
     print(client.getTableNames())
 finally:
     # Always release the connection, even if the call fails
     transport.close()

The host is simply localhost, and the port should be whatever port the thrift server is started on.

Start HBase and thrift:

 bin/ start thrift --port=9091

Run the client:


This client will simply print a blank list ([]), unless tables have been created in HBase.
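If the client fails to connect rather than printing a list, a quick way to narrow the problem down is to check whether anything is listening on the thrift port at all. The sketch below uses only the standard library; the function name thrift_port_reachable is hypothetical, and the default port of 9091 simply mirrors the port used on this page. It only confirms that a TCP connection can be opened, not that the listener is actually the HBase thrift server.

```python
import socket

def thrift_port_reachable(host="localhost", port=9091, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, or timed out.
        return False
    sock.close()
    return True
```

If this returns False, start (or restart) the thrift server and confirm the port matches the one the client uses.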

As an additional resource, follow this helpful tutorial

Logging in HBase

If logging data for the first time, add the following line to your .bashrc file:

 export PYTHONPATH="/usr/lib/python2.6/site-packages"

Otherwise you will get the following error when running the Hbase ros node: "ImportError: No module named thrift"
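To confirm that the PYTHONPATH change took effect before running the node, you can test whether the thrift package is importable from a fresh shell. This is a small sketch; the helper names can_import and import_report are hypothetical and not part of the Hbase ros node.

```python
import sys

def can_import(module_name):
    """Return True if the named module can be imported in this interpreter."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False

def import_report(module_name="thrift"):
    """Describe whether the module was found and, if not, where Python looked."""
    if can_import(module_name):
        return "%s is importable" % module_name
    searched = ", ".join(str(p) for p in sys.path)
    return "%s not found; searched: %s" % (module_name, searched)
```

If import_report() says thrift was not found, check that the directory given in PYTHONPATH appears in the searched list and actually contains the thrift package.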

Logging ROS messages in HBase is straightforward. Check out the hbase ros node, currently in the experimental section of brown-ros-pkg. In its bin directory is a script called record, which takes two or more arguments. The first argument is a session-id, a string used to retrieve your session later. The session-id must be unique; if it is not, you will be prompted to enter a new one. The remaining arguments are the names of the topics you wish to record. The script searches for the exact topic name, so be careful with capitalization and remember that all topic names start with /. When you are finished recording, press Ctrl-C and your data is logged in the repository. Data can currently be lost if the server is shut down improperly, so stop Hbase as described under To Stop Hbase.


The following command logs the four topics /headF /cmd_Larm /cmd_Rarm /blobs under the session-id 3simplex1.

 ./record 3simplex1 /headF /cmd_Larm /cmd_Rarm /blobs
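Because topic names are matched exactly, it can be worth validating the arguments before starting a recording. The following sketch illustrates the argument convention described above (a session-id followed by one or more topics that each start with /). It is not the actual record script, and parse_record_args is a hypothetical helper.

```python
def parse_record_args(args):
    """Split record-style arguments into (session_id, topics).

    Raises ValueError if there are fewer than two arguments or if any
    topic name does not start with '/'.
    """
    if len(args) < 2:
        raise ValueError("usage: record <session-id> <topic> [<topic> ...]")
    session_id = args[0]
    topics = list(args[1:])
    bad = [t for t in topics if not t.startswith("/")]
    if bad:
        raise ValueError("topic names must start with '/': %s" % ", ".join(bad))
    return session_id, topics
```

For the command above, the arguments would be split into the session-id 3simplex1 and the four topic names. A check like this cannot catch a wrong capitalization, since that depends on which topics are actually being published.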

Retrieving Data from HBase

Browsing the Table

Retrieving and Filtering Data