This project demonstrates a simple Kafka consumer and producer setup for real-time BTC price prediction using a linear regression model.
It consists of two main components:
- Kafka Consumer: Retrieves real-time BTC price data from a Kafka topic, uses a pre-trained linear regression model to predict the closing price, and calculates prediction accuracy.
- Kafka Producer: Reads historical BTC price data from a CSV file, converts it to JSON format, and produces messages to a Kafka topic.
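The producer side of the flow above (CSV rows converted to JSON, then sent to a topic) might look roughly like the sketch below. It is illustrative only and not the contents of Producer.py: the helper names `csv_to_json_messages` and `produce_all` are hypothetical, and the sketch uses the standard-library `csv` module where the project uses pandas. The `producer` argument is assumed to be any object exposing `produce`, `poll`, and `flush`, which a `confluent_kafka.Producer` does.

```python
import csv
import io
import json


def csv_to_json_messages(csv_text):
    """Turn CSV text into UTF-8 encoded JSON messages, one per data row.

    csv.DictReader keeps every value as a string, so numeric fields
    must be converted back to numbers on the consuming side.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row).encode("utf-8") for row in reader]


def produce_all(producer, topic, messages):
    """Send each message to `topic`, then wait for outstanding deliveries.

    `producer` is assumed to behave like confluent_kafka.Producer.
    Returns the number of messages handed to the producer.
    """
    for payload in messages:
        producer.produce(topic, value=payload)
        producer.poll(0)  # give the client a chance to run delivery callbacks
    producer.flush()  # block until queued messages have been delivered
    return len(messages)
```

With the real client, `producer` could be created as `confluent_kafka.Producer({"bootstrap.servers": "localhost:9092"})`. The broker address here is an assumption, as are any topic and column names you pass in.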
Ensure you have the following dependencies installed before running the project:
- Python 3.x
- confluent-kafka (imported in Python as confluent_kafka)
- pandas
- scikit-learn
You can install the required Python packages using the following command:
pip install confluent-kafka pandas scikit-learn
- Start the Kafka Producer: open a new terminal window and run the following command.
python Producer.py
- Start the Kafka Consumer: open a new terminal window and run the following command.
python Consumer.py
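
For orientation, the work the consumer does per message (decode the JSON, predict the closing price with a linear model, and measure how far off the prediction is) can be sketched as follows. This is not the code in Consumer.py. The function names, the feature names, and the use of absolute percentage error as the accuracy measure are all assumptions. The linear model is written out by hand so the example runs without scikit-learn; with a pre-trained scikit-learn `LinearRegression`, `model.predict` would replace `predict_close`.

```python
import json


def predict_close(features, weights, intercept):
    """Linear model: intercept + sum(weight_i * feature_i)."""
    return intercept + sum(w * x for w, x in zip(weights, features))


def score_message(payload, feature_names, weights, intercept, target="close"):
    """Decode one JSON message and compare the prediction with the actual close.

    `feature_names` and `target` are hypothetical column names. Values are
    passed through float() because they may arrive as strings.
    Returns (predicted, actual, absolute percentage error).
    """
    record = json.loads(payload)
    features = [float(record[name]) for name in feature_names]
    actual = float(record[target])
    predicted = predict_close(features, weights, intercept)
    pct_error = abs(predicted - actual) / abs(actual) * 100.0
    return predicted, actual, pct_error
```

In a running consumer, `payload` would typically be the result of `msg.value()` on a message returned by `confluent_kafka.Consumer.poll`, after checking that the message is not `None` and that `msg.error()` is not set.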