This project demonstrates a data pipeline that transfers data from Kafka to HBase and provides a web application to view the data stored in the HBase table.
Make sure you have Docker, Docker Compose, and Python 3 installed on your system.
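As a quick way to verify the prerequisites before starting, the check below looks up each required command on your `PATH`. It is a minimal sketch, not part of the project: the tool names are assumptions taken from the commands used in this README, and `missing_tools` is a hypothetical helper.

```python
import shutil

# Assumed tool names, taken from the commands used in this README.
REQUIRED_TOOLS = ("docker", "docker-compose", "python3")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def report(tools=REQUIRED_TOOLS, which=shutil.which):
    """Build a one-line, human-readable summary of the check."""
    missing = missing_tools(tools, which)
    if missing:
        return "Missing prerequisites: " + ", ".join(missing)
    return "All prerequisites found."
```

Calling `print(report())` from a Python shell prints either the missing commands or a confirmation that everything was found.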
- Navigate to the project directory:

      cd your_repository
- Build and start the Docker containers:

      docker-compose up --build
- First, produce data to Kafka by running the following command:

      python3 produce_to_kafka.py

  This script reads the JSON files in the `data` directory and publishes them to the Kafka topic.
- Next, consume data from Kafka and store it in HBase by running:

      python3 consume_to_hbase.py

  This script creates a table in HBase, subscribes to the Kafka topic, and writes each message it receives to that table.
- Finally, start the web application to view the data:

      python3 test.py

  This launches a web app that displays the data read from the HBase table.
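To make the data flow in the steps above more concrete, here is a minimal sketch of the two transformations the pipeline needs: turning JSON files into Kafka message payloads, and turning a payload into an HBase row. It is not the code from `produce_to_kafka.py` or `consume_to_hbase.py`. The function names, the `id` row-key field, and the `cf` column family are all assumptions made for illustration.

```python
import json
from pathlib import Path


def iter_json_messages(data_dir):
    """Yield (file name, UTF-8 encoded JSON payload) for each *.json file."""
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            record = json.load(handle)  # raises on malformed JSON
        yield path.name, json.dumps(record).encode("utf-8")


def _encode(value):
    """Encode a field value as bytes; non-string values are JSON-serialized."""
    text = value if isinstance(value, str) else json.dumps(value)
    return text.encode("utf-8")


def to_hbase_row(payload, row_key_field="id", column_family="cf"):
    """Map one JSON object payload to (row key, {column: value}), all bytes.

    row_key_field and column_family are hypothetical defaults.
    """
    record = json.loads(payload.decode("utf-8"))
    row_key = str(record[row_key_field]).encode("utf-8")
    columns = {
        f"{column_family}:{field}".encode("utf-8"): _encode(value)
        for field, value in record.items()
        if field != row_key_field
    }
    return row_key, columns
```

If the scripts use the `kafka-python` and `happybase` client libraries (an assumption; this README does not name them), each payload could be passed to `KafkaProducer.send(topic, value=payload)` on the producer side, and each `(row_key, columns)` pair to `table.put(row_key, columns)` on the consumer side. Keeping the transformations free of network calls makes them easy to test without running the containers.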