Archive for August, 2009

Erlang, Thrift and HBase

Documentation for using Thrift with HBase (and Erlang) is lacking, though the examples in Thrift’s source are pretty good, so I figured I’d post snippets as I got things going. I assume you have Erlang installed, but I suggest checking out Erlware for setting up Erlang and installing releases and applications.

First, download Thrift and HBase.

The Thrift instructions are below. Note that you need to install the C++ Boost libs and the Ruby development package first (maybe more, but that’s all I needed on my machine, hehe). I give only the Debian/Ubuntu way:

$ sudo apt-get install libboost-dev
$ sudo apt-get install ruby1.8-dev
$ cd ~/Desktop
$ svn co http://svn.apache.org/repos/asf/incubator/thrift/trunk thrift
$ cd thrift
$ ./bootstrap.sh
$ ./configure
$ make
$ sudo make install

For HBase:

$ sudo apt-get install sun-java6-jdk
$ sudo update-java-alternatives -s java-6-sun
$ sudo apt-get install ant

Download HBase, unpack it, edit conf/hbase-env.sh so that JAVA_HOME points to your Java install, and compile:

$ cd ~/Desktop
$ wget http://apache.imghat.com/hadoop/hbase/hbase-0.19.3/hbase-0.19.3.tar.gz
$ tar -zxvf hbase-0.19.3.tar.gz
$ cd hbase-0.19.3/
$ nano conf/hbase-env.sh
# The java implementation to use.  Java 1.6 required.
# export JAVA_HOME=/usr/java/jdk1.6.0/
export JAVA_HOME=/usr/lib/jvm/java-6-sun/jre/
$ ant

Copy Thrift’s Erlang libs to your Erlang lib dir:

$ sudo cp -R ~/Desktop/thrift/lib/erl /usr/lib/erlang/lib/thrift-0.1.0/

I just picked that name at “random”. I wish they had picked a standard and useful name with a version number and not named erl… But whatever.

Now you are able to use Thrift’s Erlang modules. Next, we must generate the HBase Thrift bindings.

$ cd ~/Desktop/hbase-0.19.3/src/java/org/apache/hadoop/hbase/thrift/
$ thrift -gen erl Hbase.thrift

This creates gen-erl:

$ ls  gen-erl/
hbase_constants.hrl  hbase_thrift.erl  hbase_thrift.hrl  hbase_types.erl  hbase_types.hrl

Open two new terminals (or tabs, whatever) and go back to where you compiled HBase. In the first, start HBase; in the second, start the Thrift server:

$ cd ~/Desktop/hbase-0.19.3
$ ./bin/start-hbase.sh

$ cd ~/Desktop/hbase-0.19.3
$ ./bin/hbase thrift start
09/08/31 11:11:00 INFO ThriftServer: starting HBase Thrift server on port 9090

Finally, under the gen-erl directory do the following:

$ cd gen-erl
$ erlc *.erl
$ erl 
Erlang (BEAM) emulator version 5.6.5  [smp:2] [async-threads:0] [kernel-poll:false]
Eshell V5.6.5  (abort with ^G)
1> rr(hbase_types).
2> {ok, Client} = thrift_client:start_link("localhost", 9090, hbase_thrift).
3> thrift_client:call(Client, getTableNames, []).
4> thrift_client:call(Client, createTable, ["test", [#columnDescriptor{name="test_col:"}]]).
5> thrift_client:call(Client, mutateRow, ["test", "test_row",
    [#mutation{isDelete=false,column="test_col:new", value="wooo"}]]).
6> thrift_client:call(Client, getTableNames, []).
7> thrift_client:call(Client, getRow, ["test", "test_row"]).
{ok,#tRowResult{row = <<"test_row">>,
 columns = {dict,1,16,16,8,80,48,

Here we compile the Erlang modules generated by Thrift and start an Erlang shell, which looks for beam files in the current working directory by default. Once in the shell, rr(hbase_types) adds the records contained in hbase_types.hrl to our running environment. Next, we connect to the Thrift server running on port 9090 on our localhost and tell the client that the service module is called hbase_thrift. To understand the function calls I make after that, dig into the Erlang header and source files that Thrift generated from Hbase.thrift. You’ll also need some understanding of HBase, which I don’t have time to go into here. But basically we create a table with one column family called test_col. Then we insert a value into the column new of the column family test_col. Lastly, we get the row test_row we just added from the table.
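The same steps can be collected into a small module instead of being typed into the shell. This is only a sketch: the module name hbase_demo and the run/1 function are made up for illustration, the host "localhost" is an assumption, and the only Thrift calls used are the ones from the session above. It also assumes getRow returns {ok, #tRowResult{}} as in the shell output, and it ignores whatever createTable returns or throws, since the post does not cover how the client reports errors.

```erlang
%% hbase_demo.erl (hypothetical name). Compile it inside gen-erl/ so that
%% hbase_types.hrl and the generated beam files are found.
-module(hbase_demo).
-export([run/1]).

-include("hbase_types.hrl").

%% Table is a string such as "test2". Returns the columns of test_row
%% as a list of {ColumnName, Cell} pairs.
run(Table) ->
    %% Host and port are assumptions: a local Thrift server on 9090.
    {ok, Client} = thrift_client:start_link("localhost", 9090, hbase_thrift),
    %% Create the table with one column family. The outcome is ignored so
    %% that a table that already exists does not abort the run.
    _ = (catch thrift_client:call(Client, createTable,
            [Table, [#columnDescriptor{name = "test_col:"}]])),
    %% Write one cell into test_col:new of row test_row.
    Mutation = #mutation{isDelete = false, column = "test_col:new",
                         value = "wooo"},
    _ = thrift_client:call(Client, mutateRow, [Table, "test_row", [Mutation]]),
    %% Read the row back; columns is a dict, so convert it to a list.
    {ok, #tRowResult{columns = Columns}} =
        thrift_client:call(Client, getRow, [Table, "test_row"]),
    dict:to_list(Columns).
```

With both HBase and the Thrift server running, compile it next to the generated files (erlc hbase_demo.erl) and call hbase_demo:run("test2") from the shell. Turning the dict into a list with dict:to_list/1 makes the returned columns much easier to read than the raw dict tuple.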

You can go into the HBase shell and see all this now:

$ ./bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Version: 0.19.3, r, Mon Aug 31 11:55:38 CDT 2009
hbase(main):001:0> list
1 row(s) in 0.2660 seconds
hbase(main):002:0> scan 'test'
ROW                          COLUMN+CELL
test_row                    column=test_col:new, timestamp=1251739784969, value=wooo
1 row(s) in 0.0503 seconds


Check out my post on the Erlware blog about using a simple_one_for_one supervisor to handle DB requests from webmachine for a better tutorial.


I found the posts online about Erlang’s simple_one_for_one supervisors pretty lacking, so I thought I’d throw something together about how I use them.

Using Nitrogen I have a chat site that communicates with an ejabberd backend. Each user gets a user_server process (a gen_server) that handles all the communication with ejabberd through the exmpp library from ProcessOne. These are dynamic servers, so they are unnamed and use a simple_one_for_one supervisor for their supervision, which sits under the main application’s supervisor.

The user_simple_one_for_one.erl looks basically like this:

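-module(user_simple_one_for_one).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
  supervisor:start_link(?MODULE, []).

init([]) ->
  %% Users come and go, so workers are temporary: never restarted.
  UserSpec = {user_server, {user_server, start_link, []},
              temporary, 2000, worker, [user_server]},
  StartSpecs = {{simple_one_for_one, 0, 1}, [UserSpec]},
  {ok, StartSpecs}.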

Unlike a normal supervisor, no child process is started when user_simple_one_for_one:start_link() is called. The 0 and 1 mean that at most 0 restarts are allowed within 1 second, and we use temporary for the worker since users come and go: the server should shut down on normal exit and never be restarted. In order to start a child we call user_server:start(Pid, UserName, Password), where Pid is the supervisor’s pid. So in user_server.erl we need both start_link/2 and start/3.
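On newer Erlang releases (OTP 18 and later) the supervisor flags and the child spec can also be written as maps, which names each of the positional values discussed above. A minimal sketch of the same supervisor in that style follows; the module name user_sup_maps is made up for illustration and the values are the ones from the tuple version.

```erlang
%% user_sup_maps.erl (hypothetical name): map-based variant of the supervisor.
-module(user_sup_maps).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link(?MODULE, []).

init([]) ->
    %% At most 0 restarts within 1 second.
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 0,
                 period => 1},
    %% temporary: the child is never restarted; 2000 ms to shut down.
    UserSpec = #{id => user_server,
                 start => {user_server, start_link, []},
                 restart => temporary,
                 shutdown => 2000,
                 type => worker,
                 modules => [user_server]},
    {ok, {SupFlags, [UserSpec]}}.
```

Both forms describe the same thing to the supervisor; the map form is just harder to get wrong when you forget which position is the shutdown timeout.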

start_link(UserName, Password) ->
  gen_server:start_link(?MODULE, [UserName, Password], []).

start(Super, UserName, Password) ->
  supervisor:start_child(Super, [UserName, Password]).

We then need to start the supervisor somewhere and keep its pid, so that it can be handed to whatever function starts a user_server; user_server:start/3 in turn passes it to supervisor:start_child/2:

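%% Somewhere at startup; keep Sup around for later:
{ok, Sup} = user_simple_one_for_one:start_link().

%% Later, called with the stored supervisor pid when a user logs in:
jabber_login(Sup, UserName, Password) ->
  user_server:start(Sup, UserName, Password).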

Now we have dynamic supervision of a dynamic gen_server.
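To try the whole pattern without ejabberd or exmpp, here is a self-contained sketch that can be compiled and run on its own. Everything in it is made up for illustration: the module name sofo_demo, the demo/0 function, and the whoami call. To fit in one file, the module acts as both the supervisor callback and the gen_server callback, with init/1 dispatching on its argument; in the real code these are the two separate modules shown above.

```erlang
%% sofo_demo.erl (hypothetical name): supervisor and worker in one file,
%% purely so the sketch is self-contained.
-module(sofo_demo).
-behaviour(supervisor).

-export([start_sup/0, start_user/3, demo/0]).
-export([start_link/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start the simple_one_for_one supervisor; no children exist yet.
start_sup() ->
    supervisor:start_link(?MODULE, supervisor).

%% Ask the supervisor for a new worker; the list is appended to the
%% (empty) argument list in the child spec, so start_link/2 is called.
start_user(Sup, UserName, Password) ->
    supervisor:start_child(Sup, [UserName, Password]).

%% Worker entry point, called by the supervisor.
start_link(UserName, Password) ->
    gen_server:start_link(?MODULE, {user, UserName, Password}, []).

%% Supervisor init: same spec as in the post, pointing at this module.
init(supervisor) ->
    UserSpec = {user_server, {?MODULE, start_link, []},
                temporary, 2000, worker, [?MODULE]},
    {ok, {{simple_one_for_one, 0, 1}, [UserSpec]}};
%% Worker init: the state is just the user name in this sketch.
init({user, UserName, _Password}) ->
    {ok, UserName}.

handle_call(whoami, _From, UserName) ->
    {reply, UserName, UserName}.

handle_cast(stop, UserName) ->
    {stop, normal, UserName}.

%% Start the supervisor, add two users, and report what is running.
demo() ->
    {ok, Sup} = start_sup(),
    [] = supervisor:which_children(Sup),
    {ok, Alice} = start_user(Sup, "alice", "secret"),
    {ok, _Bob} = start_user(Sup, "bob", "hunter2"),
    Name = gen_server:call(Alice, whoami),
    Workers = proplists:get_value(workers, supervisor:count_children(Sup)),
    {Name, Workers}.
```

After erlc sofo_demo.erl, calling sofo_demo:demo() in the shell should return {"alice", 2}: the supervisor starts empty, and each start_user/3 call adds one supervised worker.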
