This repository is part of a project in the course Real-Time Big Data Processing at unibz.
This project requires:

- docker and docker-compose
- python 3.7
- yarn and node 16
All setup instructions assume a *nix system, preferably some Linux variant.
Copy the example environment file. Most likely it does not need to be modified:
cp .env.example .env
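For reference, such a file usually just pins connection details. The variable names below are hypothetical; check .env.example for the actual ones:

```
# Hypothetical entries -- the real keys live in .env.example
KAFKA_BOOTSTRAP_SERVERS=localhost:9092
KAFKA_TOPIC=events
```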
First, start all the docker containers for Kafka and Flink:

$ docker-compose up -d
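The compose file is expected to wire up roughly the following services. This is a sketch of the layout, not the project's actual docker-compose.yml; the images, tags, and ports are assumptions:

```yaml
# Sketch of the expected service layout, not the actual compose file
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.0.1   # assumed image
  kafka:
    image: confluentinc/cp-kafka:7.0.1       # assumed image
    ports:
      - "9092:9092"                          # broker reachable from the host
  jobmanager:
    image: flink:1.15.0
    command: jobmanager
  taskmanager:
    image: flink:1.15.0
    command: taskmanager
```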
The producer and the processor can run using the same Python interpreter and virtual environment.
# Create virtual environment
$ python3.7 -m venv venv
$ source venv/bin/activate
# Install dependencies
$ pip install -r requirements.txt
NOTE: It's important to run the producer before the processor, as the producer creates the Kafka topic that both applications use.
Run the producer:
$ python producer/producer.py
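In outline, the producer's job looks like the sketch below. This is not the project's actual producer.py; it assumes kafka-python, a broker on localhost:9092, and a hypothetical topic name and message shape:

```python
# Sketch of the producer's role, not the actual producer.py.
import json
import time

from kafka import KafkaProducer
from kafka.admin import KafkaAdminClient, NewTopic
from kafka.errors import TopicAlreadyExistsError

BOOTSTRAP = "localhost:9092"
TOPIC = "events"  # hypothetical topic name

# Create the topic up front -- this is why the producer must run first.
admin = KafkaAdminClient(bootstrap_servers=BOOTSTRAP)
try:
    admin.create_topics([NewTopic(name=TOPIC, num_partitions=1, replication_factor=1)])
except TopicAlreadyExistsError:
    pass

producer = KafkaProducer(
    bootstrap_servers=BOOTSTRAP,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
while True:
    # Emit one JSON message per second (payload shape is assumed).
    producer.send(TOPIC, {"ts": time.time(), "value": 42})
    producer.flush()
    time.sleep(1)
```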
Run the processor (in another shell):
$ python processor/processor.py
For the processor, you also need the jar for the Kafka SQL connector for Flink. The latest version can be found here. As of writing, version 1.15.0 is used.
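To give an idea of how the jar comes into play, a PyFlink processor typically registers it and declares a Kafka-backed table roughly like this. This is a sketch, not the actual processor.py; the jar path, topic name, and schema are assumptions:

```python
# Sketch of how the connector jar is wired in, not the actual processor.py.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Point Flink at the downloaded connector jar (path is an assumption).
t_env.get_config().get_configuration().set_string(
    "pipeline.jars",
    "file:///path/to/flink-sql-connector-kafka-1.15.0.jar",
)

# Declare a table backed by the Kafka topic (topic name and schema assumed).
t_env.execute_sql("""
    CREATE TABLE events (
        ts DOUBLE,
        `value` INT
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'events',
        'properties.bootstrap.servers' = 'localhost:9092',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'json'
    )
""")
```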
Open another shell to run the front-end alongside the stream processor:
$ cd webapp
To use the Google Maps API, you need an API key from Google Cloud. Copy the .env template:

cp .env.example .env.local

and fill in the API key in the .env.local file.
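The entry will look something like the following; the variable name here is a guess, so use whatever name .env.example defines:

```
# Hypothetical variable name -- use the one defined in .env.example
NEXT_PUBLIC_GOOGLE_MAPS_API_KEY=<your-api-key>
```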
First, run the kafka-proxy. This application will create a WebSocket proxy to Kafka that we can use to access the Kafka topic directly from the browser.
$ node kafka-proxy.js
Then, start the webapp:
# Install dependencies
$ yarn
# Run the web application
$ yarn dev
# The application should now be running on http://localhost:3000