This project tests an implementation of data processing within a Kafka pipeline. The Docker setup spins up a cluster and all the monitoring components. The Python scripts create the connections, maintain topics, and manage the data processing.
- `injection.py` is the "main" tool. It sets up a listener for incoming messages on the topic configured in `config.py`.
- To generate events, use `flog` to emit logs and pipe them to `scream.py`, which writes them to the topic configured in `config.py`. E.g., `flog -f rfc3164 -l | python scream.py`
- To test consuming, run `output.py`.
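Based on the description above, `scream.py` presumably reads log lines from stdin and produces them to the configured topic. A minimal sketch, assuming the kafka-python client and placeholder topic/broker names (the real values live in `config.py`):

```python
"""Sketch of a scream.py-style producer: stdin lines -> Kafka topic.

Assumptions: the kafka-python package, plus placeholder topic and broker
names standing in for whatever config.py actually defines.
"""
import sys

TOPIC = "logs"                 # assumed; the real name comes from config.py
BOOTSTRAP = "localhost:9092"   # assumed broker address

def encode(line: str) -> bytes:
    """Drop the trailing newline and encode for the Kafka wire format."""
    return line.rstrip("\n").encode("utf-8")

def main() -> None:
    # Import kept local so the helpers above work without the client installed.
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers=BOOTSTRAP)
    for line in sys.stdin:
        producer.send(TOPIC, encode(line))
    producer.flush()

# Usage: flog -f rfc3164 -l | python scream.py
```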
I used this GitHub repo as a jumping-off point: https://github.com/papirosko/kafka-demo
You can see all the configuration settings in docker-compose.yml.
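On the consuming side, `output.py` presumably mirrors the producer with a simple consumer loop. A hedged sketch under the same assumptions (kafka-python client, placeholder topic and broker names):

```python
"""Sketch of an output.py-style consumer: Kafka topic -> stdout.

Assumptions: the kafka-python package, plus placeholder topic and broker
names standing in for whatever config.py actually defines.
"""

TOPIC = "logs"                 # assumed; the real name comes from config.py
BOOTSTRAP = "localhost:9092"   # assumed broker address

def decode(raw: bytes) -> str:
    """Decode a record value back into a printable log line."""
    return raw.decode("utf-8")

def main() -> None:
    # Import kept local so the helper above works without the client installed.
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BOOTSTRAP,
        auto_offset_reset="earliest",  # replay the topic from the start
    )
    for record in consumer:
        print(decode(record.value))

# Usage: python output.py
```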
I'm using flog to generate logs. I'm including networking in the testing process (since this project isn't about throughput performance beyond "whatever is reasonable").
Yep! It's slow, running at around 20%-25% of realtime. See stats.ipynb for some details.
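For reference, a "percent of realtime" figure like the 20%-25% above can be computed as the event-time span consumed divided by the wall-clock time spent processing it. A minimal illustration (the actual measurement lives in stats.ipynb; the function name here is hypothetical):

```python
def realtime_ratio(first_event_ts: float, last_event_ts: float,
                   processing_seconds: float) -> float:
    """Event-time span covered (seconds) divided by wall-clock seconds spent.

    A ratio of 1.0 means the pipeline keeps up with realtime; below 1.0
    means it falls behind. Timestamps are epoch seconds from the events.
    """
    span = last_event_ts - first_event_ts
    return span / processing_seconds

# e.g. consuming 60 s of log time in 300 s of wall time -> 0.2 (20% of realtime)
```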
See details in docs.