From Brown University Robotics
Revision as of 22:35, 5 May 2010 by Bkorel
The HBase Server
To Start Hbase on maria
If you need to restart HBase on maria, follow these instructions. HBase is currently installed in bkorel's home directory. The start scripts have global read and execute permissions, so anyone should be able to restart HBase.
Start the HBase server with the command below, run from the HBase root directory. You will be prompted twice for your password to localhost.
bin/start-hbase.sh
Start the thrift server:
bin/hbase-daemon.sh start thrift --port=9091
If you do not specify the port on the command line, the thrift server uses the default port, 9090. However, if you get the error "Could not create ServerSocket on address /192.168.0.1:9090" when starting the thrift server, specify a different port number, as in the command above. Note: port 9091 is currently hard-coded in the Hbase ros node (bin/record), so if you start the thrift server on any other port, change the value there to match.
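The ServerSocket error usually means something is already listening on that port. A minimal sketch for checking this before starting the thrift server is shown here; the helper names port_is_free and first_free_port are hypothetical and are not part of HBase or Thrift.

```python
import socket


def port_is_free(host, port):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except socket.error:
        # Covers "address already in use" as well as an address that
        # does not belong to this machine.
        return False
    finally:
        sock.close()


def first_free_port(host, candidates):
    """Return the first port in candidates that is free, or None."""
    for port in candidates:
        if port_is_free(host, port):
            return port
    return None
```

For example, first_free_port("192.168.0.1", [9090, 9091]) run on maria would suggest which of the two ports mentioned above to pass to --port. The result is only a hint, since another process can take the port between the check and the server start.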
To start the HBase shell:
bin/hbase shell
Example Hbase shell usage:
hbase> # Type "help" to see shell help screen
hbase> help
hbase> # To create a table named "mylittletable" with a column family of "mylittlecolumnfamily", type
hbase> create "mylittletable", "mylittlecolumnfamily"
hbase> # To see the schema for the "mylittletable" table you just created and its single "mylittlecolumnfamily", type
hbase> describe "mylittletable"
hbase> # To add a row whose id is "myrow", to the column "mylittlecolumnfamily:x" with a value of 'v', do
hbase> put "mylittletable", "myrow", "mylittlecolumnfamily:x", "v"
hbase> # To get the cell just added, do
hbase> get "mylittletable", "myrow"
In case the logging table is deleted or inaccessible, run the following command to create the appropriate table and column families for storing data using the Hbase ros node:
hbase> create "session_table", "timestamp", "msg", "topic"
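Before recreating the table, it can help to confirm that it is really missing. The sketch below assumes an already connected Thrift client of the kind built in the Python client section of this page and uses only its getTableNames call; the helper name table_exists is hypothetical.

```python
REQUIRED_TABLE = "session_table"


def table_exists(client, name=REQUIRED_TABLE):
    """Return True if name is among the tables reported by the Thrift client."""
    names = client.getTableNames()
    # Depending on the Thrift and Python versions, names may arrive as bytes.
    normalized = [n.decode("utf-8") if isinstance(n, bytes) else n
                  for n in names]
    return name in normalized
```

If table_exists(client) returns False, run the create command above in the HBase shell. This check only looks at the table name; it does not verify that the three column families are present.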
To Stop Hbase
HBase must be shut down properly; otherwise there are currently problems accessing previously stored data. Run the following two commands, in this order, to stop the thrift server and then the HBase server:
bin/hbase-daemon.sh stop thrift
bin/stop-hbase.sh
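If the shutdown is scripted, the order of the two commands should be preserved. A minimal sketch, assuming the caller knows the HBase root directory (the path is not given on this page); the function names stop_commands and stop_hbase are hypothetical.

```python
import os
import subprocess


def stop_commands(hbase_root):
    """Return the stop commands in the order they should be run."""
    return [
        [os.path.join(hbase_root, "bin", "hbase-daemon.sh"), "stop", "thrift"],
        [os.path.join(hbase_root, "bin", "stop-hbase.sh")],
    ]


def stop_hbase(hbase_root, run=subprocess.call):
    """Run both stop commands from hbase_root and return their exit codes."""
    codes = []
    for cmd in stop_commands(hbase_root):
        # subprocess.call returns the exit code instead of raising, so the
        # HBase server is still stopped if the thrift stop reports a failure.
        codes.append(run(cmd, cwd=hbase_root))
    return codes
```

The run parameter exists only so the sequence can be exercised without touching the real server; by default it is subprocess.call.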
Instructions to Install Hbase on maria
Note: root permissions are required (for installing Thrift).
Hbase is currently run in standalone, non-distributed mode.
ZooKeeper is a high-performance coordination service for distributed applications. HBase depends on ZooKeeper because HBase keeps the location of its root table, who the current master is, and what regions are currently participating in the cluster in ZooKeeper. By default HBase manages a single ZooKeeper instance for you. In standalone and pseudo-distributed modes this is usually enough, but for fully-distributed mode you should configure a ZooKeeper quorum.
Download and unpack a stable ZooKeeper release. Currently 3.2.2 is installed.
bkorel: determine if writing a conf file is necessary for standalone mode.
If you run into problems, follow ZooKeeper Getting Started.
Hadoop is a distributed computing platform. If running in standalone mode, you will not need to install/configure or run Hadoop because HBase by default uses the local filesystem. Distributed modes require an instance of the Hadoop Distributed File System (DFS).
Download and unpack a stable Hadoop release.
Follow the Hadoop Quick Start instructions.
See the Hadoop requirements and instructions for how to set up a DFS. Before starting HBase you need to start the Hadoop DFS daemons and stop the daemons after HBase has shut down.
HBase provides Bigtable-like structured storage. The following are instructions for standalone mode. For pseudo- or fully-distributed modes, follow the HBase Getting Started instructions.
Download and unpack a stable HBase release. Currently 0.20.3 is installed.
cd to the HBase root directory and open conf/hbase-env.sh. Edit this file to set JAVA_HOME to point at the root of the Java installation:
Test HBase by starting the server and shell. Be sure to stop HBase when done.
bin/start-hbase.sh
bin/hbase shell
bin/stop-hbase.sh
Thrift is a software framework for cross-language services development. Since the HBase API is in Java, and we would like to write ros nodes in Python, thrift is necessary to interface Python clients to HBase.
Download and unpack a stable Thrift release. Currently 0.2.0 is installed.
Requirements: Ruby 1.8+
sudo apt-get install ruby1.8-dev
If installing on maria again, you will not need to run this. However if installing on a new machine and compiling thrift gives an "extconf.rb:20:in `require': no such file to load -- mkmf (LoadError)" error, the mkmf.rb library file is missing. This can be obtained from the ruby1.8-dev package by running the above command.
From the root thrift directory, configure thrift:
sudo make install
After installing thrift, there should be a system-wide "thrift" command available, which should provide some usage information.
If problems occur during installation, follow the Thrift Installation instructions.
Write a Python Client to Test Thrift
The HBase Thrift API is described in the following file: [hbase-root]/src/java/org/apache/hadoop/hbase/thrift/Hbase.thrift
Generate a thrift client package:
cd [hbase-root]/src/java/org/apache/hadoop/hbase/thrift
thrift --gen py Hbase.thrift
This produces a directory called gen-py containing the generated Python classes that handle communication with the HBase thrift server. Copy the hbase directory contained in the gen-py directory to a project directory of your choosing.
Create a client.py file in the same directory:
#!/usr/bin/env python
import sys

from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import *

host = '192.168.0.1'
port = 9091  # default thrift port is 9090

# Make socket
transport = TSocket.TSocket(host, port)
# Buffering is critical. Raw sockets are very slow
transport = TTransport.TBufferedTransport(transport)
# Wrap in a protocol
protocol = TBinaryProtocol.TBinaryProtocol(transport)

client = Hbase.Client(protocol)
transport.open()
# List the tables known to the HBase server
print(client.getTableNames())
# Release the connection to the thrift server
transport.close()
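In longer clients it is easy to leave the connection open when a call raises an exception. One possible pattern, sketched here, wraps any object with open() and close() methods (such as the buffered transport in client.py) in a context manager; the name opened is hypothetical and is not part of the Thrift library.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open the transport, yield it, and close it even if the body fails."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


# Usage with the objects from client.py (not executed here):
#
#     with opened(transport):
#         print(client.getTableNames())
```

The helper relies only on the open() and close() methods, so it makes no assumption about which Thrift transport class is in use.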
Logging in HBase
If logging data for the first time, add the following line to your .bashrc file:
Otherwise you will get the following error when running the Hbase ros node: "ImportError: No module named thrift"
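To find out ahead of time whether this error will occur, you can ask Python which modules it can import. A minimal sketch; the helper name missing_modules is hypothetical, and the check does not replace the .bashrc setting.

```python
def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling missing_modules(["thrift"]) in the shell you intend to record from returns an empty list when the thrift module can be found and ["thrift"] when it cannot.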
Logging ROS messages in HBase is straightforward. Check out the hbase ros node, currently in the experimental section of brown-ros-pkg. Its bin directory contains a script called record, which takes two or more arguments. The first argument is a session-id, a string used to retrieve your session later. The session-id must be unique; if it is not, you will be prompted to enter a new one. The remaining arguments are the names of the topics you wish to record. The script searches for the exact topic name, so be careful with capitalization and remember that all topic names start with /. When you are finished recording, press Ctrl-C and your data is logged in the repository. Data can currently be lost if the server is shut down improperly, so follow the steps under To Stop Hbase.
The following command logs the four topics /headF, /cmd_Larm, /cmd_Rarm and /blobs under the session-id 3simplex1.
./record 3simplex1 /headF /cmd_Larm /cmd_Rarm /blobs
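The argument rules described above (one session-id, then one or more topic names that each start with /) can be sketched as a small check. This is an illustration only, not the code of the record script; the function name parse_record_args is hypothetical.

```python
def parse_record_args(args):
    """Split command-line arguments into (session_id, topics).

    Raises ValueError if fewer than two arguments are given or if any
    topic name does not start with "/".
    """
    if len(args) < 2:
        raise ValueError("usage: record <session-id> <topic> [<topic> ...]")
    session_id = args[0]
    topics = list(args[1:])
    bad = [t for t in topics if not t.startswith("/")]
    if bad:
        raise ValueError("topic names must start with '/': " + ", ".join(bad))
    return session_id, topics
```

For the command above, the arguments after ./record would be split into the session-id 3simplex1 and the list of four topics. Uniqueness of the session-id is not checked in this sketch, since that requires asking the server.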
Retrieving Data from HBase
Browsing the Table
Retrieving and Filtering Data