/sculptor

Primary LanguageJavaApache License 2.0Apache-2.0

Sculptor is a framework and command line interface that makes using Apache HBase much easier.

== Features ==
- Human friendly output
- Interactive command line interface
- Run ad-hoc MapReduce over HBase
- Customized client framework

Please see the doc at the wiki page.


== Pre-requirements ==
- Apache Ivy is required to build Sculptor
- Apache HBase installed
- Hadoop MapReduce (Sculptor MapReduce mode only)


== Build ==
$ cd sculptor
$ ant 


== Run Sculptor ==
1. Copy sculptor folder to your HBase client

2. Edit $SCULPTOR_HOME/sculptor, change HBASE_HOME variable to fit your environment

3. Initialize the sample
   $ sh $SCULPTOR_HOME/sculptor sample

4. Start Sculptor
   $ sh $SCULPTOR_HOME/sculptor
   sculptor> get from sc_item_data  item_id>100 limit=1
   sculptor> help

5. Use Sculptor for your own HBase tables
   Extend HEntity to create an entity class to represent your table
   Extend HClient, implement the abstract methods to handle your table access
   Package your classes, drop your jar under $SCULPTOR_HOME/auxlib directory
   Add one line to $SCULPTOR_HOME/conf/tables about your table
   See $SCULPTOR_HOME/sample for details


== Sculptor MapReduce ==
1. For Sculptor MapReduce mode only, get your environment ready to run customized MapReduce jobs over HBase.
   This usually requires the following steps:
     - Add HBase and ZooKeeper jars to Hadoop classpath by:
       $ echo "export HADOOP_CLASSPATH=$HBASE_HOME/hbase-0.9x.jar:$HBASE_HOME/lib/zookeeper-3.x.y.jar" >> hadoop-env.sh
     - Let Hadoop know where HBase is running
       $ ln -s $HBASE_HOME/conf/hbase-site.xml $HADOOP_HOME/conf/hbase-site.xml
     - Add your jar file (e.g. auxlib/sculptor-sample.jar) to Hadoop classpath.
       The easiest way to do this is to drop your jar file to $HADOOP_HOME/lib directory.
     - Sync the above changes across the cluster
     - Restart Hadoop MapReduce

2. Run ad-hoc MapReduce via Sculptor
   sculptor> set process mapreduce
   sculptor> get from sc_item_data item_id>100