Kafka Connect for HBase

Language: Java · License: Apache-2.0

A sink connector that writes Kafka records to HBase.
The corresponding source connector implementation is available at https://github.com/mravi/hbase-connect-kafka

Prerequisites

  • Confluent 3.2.x
  • Kafka 0.10.2.x
  • HBase 1.3.0
  • JDK 1.8

Assumptions

  • The HBase table already exists.
  • Each Kafka topic is mapped to an HBase table.

Properties

The following properties must be set in the connector configuration file:

name | data type | required | description
---- | --------- | -------- | -----------
zookeeper.quorum | string | yes | ZooKeeper quorum of the HBase cluster.
zookeeper.znode.parent | string | yes | ZooKeeper parent znode of the HBase cluster (default: /hbase).
event.parser.class | string | yes | Either AvroEventParser or JsonEventParser, to parse Avro or JSON events respectively.
topics | string | yes | Comma-separated list of Kafka topics to sink.
hbase.<topicname>.rowkey.columns | string | yes | Columns that form the rowkey of the HBase table <topicname>.
hbase.<topicname>.rowkey.delimiter | string | no | Delimiter used to join multiple rowkey columns (used in the example below).
hbase.<topicname>.family | string | yes | Column family of the HBase table <topicname>.
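To illustrate how the rowkey properties interact: a sink connector of this kind typically builds each row's key by joining the values of the configured rowkey columns with the configured delimiter. The sketch below shows that idea only; RowKeySketch and makeRowKey are hypothetical names, not the connector's actual implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class RowKeySketch {

    // Joins the values of the configured rowkey columns with the delimiter,
    // mirroring hbase.<topic>.rowkey.columns and hbase.<topic>.rowkey.delimiter.
    static String makeRowKey(Map<String, Object> record, List<String> columns, String delimiter) {
        return columns.stream()
                .map(c -> String.valueOf(record.get(c)))
                .collect(Collectors.joining(delimiter));
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("id", 1);
        record.put("name", "foo");
        // With rowkey.columns=id,name and rowkey.delimiter=| the key becomes "1|foo"
        System.out.println(makeRowKey(record, Arrays.asList("id", "name"), "|"));
    }
}
```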

Example connector.properties file

name=kafka-cdc-hbase
connector.class=io.svectors.hbase.sink.HBaseSinkConnector
tasks.max=1
topics=test
zookeeper.quorum=localhost:2181
zookeeper.znode.parent=/hbase
event.parser.class=io.svectors.hbase.parser.AvroEventParser
hbase.test.rowkey.columns=id
hbase.test.rowkey.delimiter=|
hbase.test.family=d
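To sink JSON events instead of Avro, only the parser class changes; the rest of the file stays the same. This assumes JsonEventParser lives in the same package as the Avro parser shown above:

event.parser.class=io.svectors.hbase.parser.JsonEventParser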

Packaging

  • mvn clean package

Deployment

  • Copy the packaged jar and the connector properties file

mkdir $CONFLUENT_HOME/share/java/kafka-connect-hbase
cp target/hbase-sink.jar $CONFLUENT_HOME/share/java/kafka-connect-hbase/
cp hbase-sink.properties $CONFLUENT_HOME/share/java/kafka-connect-hbase/
  • Start ZooKeeper, Kafka and the Schema Registry
nohup $CONFLUENT_HOME/bin/zookeeper-server-start $CONFLUENT_HOME/etc/kafka/zookeeper.properties &
nohup $CONFLUENT_HOME/bin/kafka-server-start $CONFLUENT_HOME/etc/kafka/server.properties &
nohup $CONFLUENT_HOME/bin/schema-registry-start $CONFLUENT_HOME/etc/schema-registry/schema-registry.properties &
  • Create the HBase table 'test' from the HBase shell
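For example, a table matching the properties above (table test, column family d) can be created from the HBase shell with:

create 'test', 'd'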

  • Start the HBase sink connector in standalone mode

export CLASSPATH=$CONFLUENT_HOME/share/java/kafka-connect-hbase/hbase-sink.jar

$CONFLUENT_HOME/bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-hbase/hbase-sink.properties
  • Test with the Avro console producer: start it to create the topic and write values
$CONFLUENT_HOME/bin/kafka-avro-console-producer \
--broker-list localhost:9092 --topic test \
--property value.schema='{"type":"record","name":"record","fields":[{"name":"id","type":"int"}, {"name":"name", "type": "string"}]}'
# enter records at the prompt
{"id": 1, "name": "foo"}
{"id": 2, "name": "bar"}
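After producing a few records, the writes can be verified from the HBase shell; with the example configuration each row is keyed by id and stored under column family d:

scan 'test'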