Internal:Hbase

From Brown University Robotics

Revision as of 22:23, 5 May 2010

The HBase Server

To Start Hbase on maria

If you need to restart Hbase on maria, follow these instructions. Currently Hbase is installed in bkorel's home directory. The start scripts have global read and execute permissions, so anyone should be able to restart Hbase.

 cd /home/bkorel/HBase/src/hbase-0.20.3

Start the Hbase server with the command below. You will be prompted twice for your password to localhost.

 bin/start-hbase.sh

Start the thrift server:

 bin/hbase-daemon.sh start thrift --port=9091

If you do not specify the port on the command line, the thrift server uses the default port, 9090. However, if you get the error "Could not create ServerSocket on address /192.168.0.1:9090" when starting the thrift server, specify a different port number, as in the command above. Note: port 9091 is currently hard coded in the Hbase ros node (bin/record), so if the thrift server is started on any other port, that value must be changed to match.
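The "Could not create ServerSocket" error usually means that something else is already bound to the port. As a rough aid, the following Python 3 sketch checks whether a port can be bound before the thrift server is started. The function names, the default host 127.0.0.1, and the candidate port list are assumptions made for illustration; they are not part of HBase or Thrift.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP server socket can be bound to (host, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError:
        # Typically "address already in use": another process holds the port.
        return False
    finally:
        sock.close()


def first_free_port(candidates=(9090, 9091, 9092)):
    """Return the first port in candidates that can be bound, or None."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None
```

If first_free_port() returns something other than 9091, remember that the port in bin/record has to be changed as well.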

To start the Hbase shell:

 bin/hbase shell

Example Hbase shell usage:

 hbase> # Type "help" to see shell help screen
 hbase> help
 hbase> # To create a table named "mylittletable" with a column family of "mylittlecolumnfamily", type
 hbase> create "mylittletable", "mylittlecolumnfamily"
 hbase> # To see the schema for your just-created "mylittletable" table and its single "mylittlecolumnfamily", type
 hbase> describe "mylittletable"
 hbase> # To add a row whose id is "myrow", to the column "mylittlecolumnfamily:x" with a value of 'v', do
 hbase> put "mylittletable", "myrow", "mylittlecolumnfamily:x", "v"
 hbase> # To get the cell just added, do
 hbase> get "mylittletable", "myrow"

In case the logging table is deleted or inaccessible, run the following command to create the appropriate table and column families for storing data using the Hbase ros node:

 hbase> create "session_table", "timestamp", "msg", "topic"

To Stop Hbase

Hbase must be shut down properly; otherwise, there are currently problems accessing previously stored data. Run the following two commands, in this order, to stop the thrift and Hbase servers:

 bin/hbase-daemon.sh stop thrift
 bin/stop-hbase.sh

Instructions to Install Hbase on maria

Note: root permissions are required (for installing Thrift).

Hbase is currently run in standalone, non-distributed mode.

Install ZooKeeper

ZooKeeper is a high-performance coordination service for distributed applications. HBase depends on ZooKeeper because HBase keeps the location of its root table, who the current master is, and what regions are currently participating in the cluster in ZooKeeper. By default HBase manages a single ZooKeeper instance for you. In standalone and pseudo-distributed modes this is usually enough, but for fully-distributed mode you should configure a ZooKeeper quorum.

Download and unpack a stable ZooKeeper release. Currently 3.2.2 is installed.

bkorel: determine if writing a conf file is necessary for standalone mode.

If you run into problems, follow ZooKeeper Getting Started.

Install Hadoop

Hadoop is a distributed computing platform. If running in standalone mode, you will not need to install/configure or run Hadoop because HBase by default uses the local filesystem. Distributed modes require an instance of the Hadoop Distributed File System (DFS).

Download and unpack a stable Hadoop release.

Follow the Hadoop Quick Start instructions.

See the Hadoop requirements and instructions for how to set up a DFS. Before starting HBase you need to start the Hadoop DFS daemons and stop the daemons after HBase has shut down.

Install Hbase

HBase provides Bigtable-like structured storage. The following are instructions for standalone mode. For pseudo- or fully-distributed modes, follow the HBase Getting Started instructions.

Download and unpack a stable HBase release. Currently 0.20.3 is installed.

cd to the HBase root directory and open conf/hbase-env.sh. Edit this file to set JAVA_HOME to point at the root of the Java installation:

 export JAVA_HOME=/usr

Test HBase by starting the server and shell. Be sure to stop HBase when done.

 bin/start-hbase.sh
 bin/hbase shell
 bin/stop-hbase.sh

Install Thrift

Thrift is a software framework for cross-language services development. Since the HBase API is in Java, and we would like to write ros nodes in Python, thrift is necessary to interface Python clients to HBase.

Download and unpack a stable Thrift release. Currently 0.2.0 is installed.

Requirements: Ruby 1.8+

 sudo apt-get install ruby1.8-dev

If installing on maria again, you will not need to run this. However if installing on a new machine and compiling thrift gives an "extconf.rb:20:in `require': no such file to load -- mkmf (LoadError)" error, the mkmf.rb library file is missing. This can be obtained from the ruby1.8-dev package by running the above command.

From the root thrift directory, configure thrift:

 ./configure

Compile thrift:

 make

Install:

 sudo make install

After installing thrift, there should be a system-wide "thrift" command available, which should provide some usage information.

If problems occur during installation, follow the Thrift Installation instructions.

The HBase Thrift API is described in the following file: [HBASE_HOME]/src/java/org/apache/hadoop/hbase/thrift/Hbase.thrift

Note: the IP address is hard coded in ThriftServer.java.

Thrift API

Logging in HBase

If logging data for the first time, add the following line to your .bashrc file:

 export PYTHONPATH="/usr/lib/python2.6/site-packages"

Otherwise you will get the following error when running the Hbase ros node: "ImportError: No module named thrift"
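To confirm that the PYTHONPATH change took effect before running the node, a quick check along these lines may help. It is a minimal Python 3 sketch that uses only the standard library; the helper name module_available is hypothetical.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the top-level module `name` can be found on sys.path."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_available("thrift"):
        print("thrift module found")
    else:
        # Listing the search path shows whether the PYTHONPATH entry is present.
        print("thrift module not found; directories searched:")
        for entry in sys.path:
            print("  " + entry)
```

Run it from a new shell, so that the updated .bashrc has been read.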

Logging ROS messages in HBase is straightforward. Check out the hbase ros node, currently in the experimental section of brown-ros-pkg. Its bin directory contains a script called record, which takes two or more arguments. The first argument is a session-id, a string used to retrieve your session later. The session-id must be unique; if it is not, you will be prompted to enter a new one. The subsequent arguments are the names of the topics you wish to record. The script searches for the exact topic name, so be careful with capitalization and remember that all topic names start with /. When you are finished recording, press Ctrl-C, and your data is logged in the repository. Currently, data can be lost if the server is shut down improperly, so be careful.

Example

The following command logs the four topics /headF /cmd_Larm /cmd_Rarm /blobs under the session-id 3simplex1.

 ./record 3simplex1 /headF /cmd_Larm /cmd_Rarm /blobs
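The argument rules for record can be sketched as a small check in Python 3. This is not the code of the record script; parse_record_args is a hypothetical helper that only illustrates the rules stated above (one session-id, then one or more topic names starting with /). It does not check that the session-id is unique, since that requires querying HBase.

```python
def parse_record_args(argv):
    """Split record-style arguments into (session_id, topics).

    argv is the argument list without the program name.
    Raises ValueError when the arguments break the rules described above.
    """
    if len(argv) < 2:
        raise ValueError("expected a session-id followed by at least one topic")
    session_id, topics = argv[0], list(argv[1:])
    if not session_id:
        raise ValueError("session-id must be a non-empty string")
    # Topic names are matched exactly, so only the leading "/" is checked here.
    bad = [t for t in topics if not t.startswith("/")]
    if bad:
        raise ValueError("topic names must start with '/': " + ", ".join(bad))
    return session_id, topics
```

For the command above, parse_record_args(["3simplex1", "/headF", "/cmd_Larm", "/cmd_Rarm", "/blobs"]) yields the session-id 3simplex1 and the four topic names unchanged.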

Retrieving Data from HBase

Browsing the Table

Retrieving and Filtering Data