- Still under development.
- This document may not be entirely correct.
- Some "moving" parts are not fully described or not even listed yet.
Here we are going to simulate a system using Kafka and Spark to examine their capability to handle continuous data distribution, while at the same time testing how PouchDB and CouchDB handle auto-sync between the frontend and backend.
There are 11 components:
- Kafka Brokers
- Zookeeper
- CMAK as Kafka WebUI Manager
- CouchDB
- User Frontend Application
- Admin Frontend Application
- Kafka Consumer
- Backend API as Kafka Producer
- Spark Master
- Spark Worker
- Apache Zeppelin
Local tooling required:
- Node, Npm
- Make
- Pyenv, Pipenv
- Docker
Basically, run make setup to install all dependencies and create a Docker network:
$ make setup
Our CouchDB must be exposed to localhost, with the default authentication config being used (refer to ./env/couchdb).
Add CORS to CouchDB so the AdminApp can sync with it:
$ make add_cors_couch
Alternatively, when CouchDB runs as a Docker service, CORS can be enabled through the GUI at http://localhost:5984/_utils/#_config/nonode@nohost. Note that the CouchDB cluster config might need some manual adjustment.
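For reference, a rough Python sketch of what enabling CORS through CouchDB's node-level config API can look like is shown below. The admin credentials and exact config sections are assumptions (older CouchDB releases use httpd instead of chttpd for enable_cors), and the actual make recipe may do this differently.

```python
# Rough sketch of enabling CORS via CouchDB's node config API -- not the actual make recipe.
# Credentials and the node name are placeholders; use the values from ./env/couchdb.
import json
import requests

COUCH = "http://localhost:5984"
AUTH = ("admin", "password")          # placeholder credentials
NODE = "nonode@nohost"                # single-node default, as seen in the Fauxton URL above

cors_settings = {
    "chttpd/enable_cors": "true",     # older CouchDB releases use httpd/enable_cors instead
    "cors/origins": "*",              # tighten this to the AdminApp origin in real setups
    "cors/credentials": "true",
    "cors/methods": "GET, PUT, POST, HEAD, DELETE",
    "cors/headers": "accept, authorization, content-type, origin, referer",
}

for path, value in cors_settings.items():
    section, key = path.split("/")
    resp = requests.put(
        f"{COUCH}/_node/{NODE}/_config/{section}/{key}",
        auth=AUTH,
        data=json.dumps(value),       # the config API expects a JSON string body
    )
    resp.raise_for_status()
    print(path, "->", value)
```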
Use CMAK as the Kafka manager to add the cluster and then set up proper partition assignment by going to http://localhost:9000.
- Run the whole system. Use the following command to fire the system up, optionally scaling the services (e.g. 2 (k)afka brokers, 3 (s)park workers).
NOTE: Each Spark executor requires one CPU core, so depending on your machine spec you can have more or fewer of them.
$ make up-scale k=2 s=3
- Submit the Spark application
If Zookeeper and the brokers are ready, deploy the Spark app using:
$ make submit_job
Once the job has been successfully deployed, go to http://localhost:8080 and find the newly running application.
If you want to make any changes to the job, just modify the Scala code in ./spark_job and re-run the above command.
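The real job is the Scala code under ./spark_job; purely to illustrate the general shape of such a streaming job, here is a minimal PySpark Structured Streaming sketch. The broker address, topic name, and windowed aggregation are assumptions for illustration, not a description of the actual job, and running it requires the spark-sql-kafka connector package.

```python
# Illustrative PySpark sketch of a streaming job similar in spirit to ./spark_job
# (the real job is Scala). Broker address and topic name below are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("numbers-demo").getOrCreate()

# Read the raw stream of numbers published by the backend producer.
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")  # assumed broker address
    .option("subscribe", "numbers")                   # assumed topic name
    .load()
)

# Kafka values arrive as bytes; cast them and aggregate over a short window.
numbers = raw.select(
    F.col("value").cast("string").cast("double").alias("n"),
    F.col("timestamp"),
)
stats = (
    numbers
    .withWatermark("timestamp", "1 minute")
    .groupBy(F.window("timestamp", "30 seconds"))
    .agg(F.count("n").alias("count"), F.avg("n").alias("avg"))
)

# Print the aggregates; the real job would eventually push results towards CouchDB.
query = stats.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```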
- Start the frontend applications. Start the UserApp and AdminApp in 2 separate terminals, using these 2 commands:
$ make fe_user
$ make fe_admin
- Send data to Kafka
Use the UserApp at http://localhost:3001 to continuously send data (a stream of numbers) to the Backend-Producer.
Alternatively, you can go to http://localhost:8000/docs and use Swagger to make API requests.
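For orientation, the Backend-Producer essentially accepts numbers over HTTP and publishes them to a Kafka topic. The sketch below only illustrates that idea; the route, topic name, and libraries (FastAPI, kafka-python) are assumptions and may not match the real implementation.

```python
# Hedged sketch of a backend producer: accept a number over HTTP, publish it to Kafka.
# The route, topic name, and libraries (FastAPI, kafka-python) are assumptions,
# not necessarily what the real Backend-Producer uses.
import json
from fastapi import FastAPI
from kafka import KafkaProducer

app = FastAPI()
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",                      # assumed broker address
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

@app.post("/numbers")                                        # hypothetical route
def publish_number(value: float):
    # Fire-and-forget publish; the consumer / Spark job picks it up downstream.
    producer.send("numbers", {"value": value})               # assumed topic name
    return {"status": "queued", "value": value}
```

With FastAPI, an interactive Swagger UI is served at /docs automatically, which matches the workflow described above.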
- Watch changes from the frontend AdminApp
If the admin app is already running, go to http://localhost:3002 and watch the changes as messages are streamed to CouchDB.
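If you also want to observe the incoming documents from a terminal rather than through the AdminApp's PouchDB sync, a small sketch like the following tails CouchDB's continuous _changes feed; the database name and credentials are placeholders to adjust for your setup.

```python
# Tail CouchDB's continuous _changes feed as a command-line alternative to the AdminApp.
# Database name and credentials are placeholders -- match them to your setup.
import json
import requests

url = "http://localhost:5984/numbers/_changes"   # "numbers" is a placeholder db name
params = {"feed": "continuous", "include_docs": "true", "since": "now", "heartbeat": "10000"}

with requests.get(url, params=params, auth=("admin", "password"), stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:                              # heartbeats arrive as blank lines
            continue
        change = json.loads(line)
        print(change.get("id"), "->", change.get("doc"))
```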
- Enjoy hacking on your own :)
Considering what to add to complete the architecture:
- Add CouchDB
- Add API Client App
- Add Admin Client App with PouchDB for db real-time tracking
- Developing Producer Backend API
- Add Spark to consumer, connect to Kafka for streaming
- Scale Consumer
- Scale Kafka Broker and Spark Worker
- Use gRPC for backend-producer
- Store calculated data from Spark to CouchDB
- Apache Beam?
- Apache Avro
- KSQL?
- Deploy everything with Kubernetes
- Stress-testing