This project builds a streaming event pipeline around Apache Kafka and its ecosystem (REST Proxy, Kafka Connect). Using public data from the Chicago Transit Authority, you will construct an event pipeline that allows you to simulate and display the status of train lines in real time. When the project is completed, you will be able to monitor a website and watch trains move from station to station.
- Install Docker, and make sure Docker Compose is installed as well
- If you are on a Windows machine:
  - Install Windows Subsystem for Linux (WSL) version 2 (link)
  - Install Ubuntu 20.04
  - Install the `librdkafka` library (link), and make sure to install anything else required by that link
- Inside Ubuntu, in a terminal instance, run `docker-compose up`
- You can check the status of the environment by running `docker-compose ps` in a new terminal instance
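Beyond `docker-compose ps`, you can verify that the Kafka broker is actually accepting TCP connections. Below is a minimal stdlib-only sketch; the `localhost:9092` address is an assumption about how this compose file exposes the broker, not something taken from the project.

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: check the Kafka broker (the port is an assumption)
# print(port_open("localhost", 9092))
```

If this returns `False` while `docker-compose ps` shows the broker container as `Up`, the broker may still be starting; give it a minute and retry.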
There are two pieces of the simulation, the `producer` and the `consumer`; each of them can be run separately. To run the end-to-end simulation, run all the pieces together (in different terminal windows):
- To run the `producer`:

  ```
  cd producers
  virtualenv venv
  . venv/bin/activate
  pip install -r requirements.txt
  python simulation.py
  ```

  Hit `Ctrl+C` at any time to exit.
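The simulation's producers emit events such as train arrivals to Kafka topics. As a rough illustration of what one such event might look like on the wire, here is a stdlib-only sketch; the field names, topic name, and broker address are illustrative assumptions, not the project's actual schema (that lives in `producers/`).

```python
import json
import time

def make_arrival_event(station_id: int, train_id: str, line: str, status: str) -> bytes:
    """Serialize a hypothetical train-arrival event as JSON bytes."""
    event = {
        "station_id": station_id,
        "train_id": train_id,
        "line": line,
        "status": status,
        "timestamp": int(time.time() * 1000),
    }
    return json.dumps(event).encode("utf-8")

# With confluent-kafka installed, a producer would send it roughly like this
# (topic and broker address below are assumptions):
#   from confluent_kafka import Producer
#   p = Producer({"bootstrap.servers": "localhost:9092"})
#   p.produce("station.arrivals", make_arrival_event(40010, "BL1", "blue", "in_service"))
#   p.flush()
```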
- To run the Faust Stream Processing Application:

  ```
  cd consumers
  virtualenv venv
  . venv/bin/activate
  pip install -r requirements.txt
  faust -A faust_stream worker -l info
  ```
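Conceptually, the Faust worker consumes raw station records and emits simplified, transformed records. The transformation itself can be sketched as a pure function; the field names below are illustrative assumptions about the shape of the data, and the actual logic lives in `faust_stream.py`.

```python
from dataclasses import dataclass

@dataclass
class Station:
    station_id: int
    station_name: str
    red: bool
    blue: bool
    green: bool

@dataclass
class TransformedStation:
    station_id: int
    station_name: str
    line: str

def transform(station: Station) -> TransformedStation:
    """Collapse boolean line flags into a single 'line' field,
    the kind of reshaping a Faust agent might perform."""
    if station.red:
        line = "red"
    elif station.blue:
        line = "blue"
    elif station.green:
        line = "green"
    else:
        line = "unknown"
    return TransformedStation(station.station_id, station.station_name, line)
```

Inside the Faust app, a function like this would run inside an `@app.agent` coroutine that iterates over the input topic's stream.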
- To run the KSQL Creation Script:

  ```
  cd consumers
  virtualenv venv
  . venv/bin/activate
  pip install -r requirements.txt
  python ksql.py
  ```
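A script like `ksql.py` typically submits SQL statements to the KSQL server's REST API. Here is a minimal stdlib-only sketch of building such a request; the table name, source topic, and server address are illustrative assumptions, not the project's actual statements.

```python
import json
import urllib.request

# Default KSQL REST port; an assumption about this docker-compose setup.
KSQL_URL = "http://localhost:8088/ksql"

STATEMENT = """
CREATE TABLE turnstile_summary AS
    SELECT station_id, COUNT(*) AS count
    FROM turnstile_events
    GROUP BY station_id;
"""

def build_request(statement: str) -> urllib.request.Request:
    """Build (but do not send) the POST request a KSQL creation script would issue."""
    payload = json.dumps({"ksql": statement}).encode("utf-8")
    return urllib.request.Request(
        KSQL_URL,
        data=payload,
        headers={"Content-Type": "application/vnd.ksql.v1+json"},
        method="POST",
    )

# Sending it would be: urllib.request.urlopen(build_request(STATEMENT))
```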
- To run the `consumer`:

  ```
  cd consumers
  virtualenv venv
  . venv/bin/activate
  pip install -r requirements.txt
  python server.py
  ```

  Hit `Ctrl+C` at any time to exit.
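The consumer side turns the stream of events into displayable state for the website. The core bookkeeping can be sketched as a function that folds each consumed message into a per-station status map; the field names are illustrative assumptions matching nothing in particular from `server.py`.

```python
import json

def apply_event(state: dict, raw_message: bytes) -> dict:
    """Update the station-status map in place with one consumed arrival event."""
    event = json.loads(raw_message)
    state[event["station_id"]] = {
        "train_id": event["train_id"],
        "line": event["line"],
        "status": event["status"],
    }
    return state

# A confluent-kafka consumer loop would drive this, roughly
# (broker address, group id, and topic below are assumptions):
#   from confluent_kafka import Consumer
#   c = Consumer({"bootstrap.servers": "localhost:9092", "group.id": "web-ui"})
#   c.subscribe(["station.arrivals"])
#   while True:
#       msg = c.poll(1.0)
#       if msg is not None and msg.error() is None:
#           apply_event(station_state, msg.value())
```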
- To stop Docker, run `docker-compose stop` in a terminal instance
- To clean up the containers and reclaim disk space, run `docker-compose rm -v`