If you only need to start and showcase the application rather than develop it, simply run the following script. It builds the required Docker images and starts the Docker Compose app:
```bash
./start.sh
```
You can list your local containers with the following command:
```bash
docker container ls
```
Example output:
```
CONTAINER ID   IMAGE                                  COMMAND                  CREATED          STATUS          PORTS                                                      NAMES
f951a4f7ed54   dedup-api:1.0.0                        "uvicorn backend.src…"   33 seconds ago   Up 31 seconds   0.0.0.0:8000->8000/tcp                                     docker-dedup-api-1
19d2b215b148   ymurong/spark-master:3.3.0-hadoop3.3   "/bin/bash /master.sh"   9 minutes ago    Up 31 seconds   0.0.0.0:7077->7077/tcp, 6066/tcp, 0.0.0.0:8080->8080/tcp   spark-master
```
The UI and service addresses are exposed as follows:
```bash
# OpenAPI3 Playground UI
http://localhost:8000/docs

# Spark Application UI
http://localhost:8080

# Spark Master Address
spark://localhost:7077
```
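To verify the containers are reachable, here is a minimal smoke-test sketch using only the Python standard library. It probes the two HTTP endpoints listed above; the Spark master address is an RPC endpoint, not HTTP, so it is skipped:

```python
from urllib.request import urlopen
from urllib.error import URLError

# Probe the two HTTP endpoints exposed by the compose app.
for name, url in [
    ("OpenAPI playground", "http://localhost:8000/docs"),
    ("Spark UI", "http://localhost:8080"),
]:
    try:
        with urlopen(url, timeout=5) as resp:
            print(f"{name}: HTTP {resp.status}")
    except URLError as exc:
        print(f"{name}: unreachable ({exc.reason})")
```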
You can stop and remove the Spark and API containers by running the following script:
```bash
./stop.sh
```
You must use Python 3.10, since the remote Spark cluster workers only support Python 3.10. The following example uses Python 3.10. Run these commands from the PROJECT ROOT:
```bash
pip3.10 install virtualenv
virtualenv venv --python=python3.10
```
Install the global dependencies:
```bash
source venv/bin/activate
pip install -r requirements.txt
```
Install the backend package in editable mode so that backend modules can locate their packages:
```bash
cd backend
pip install -r requirements.txt -e .
```
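To confirm the editable install worked, you can check that the package resolves from inside the virtualenv. A minimal sketch, assuming the top-level package is named `backend` (check the actual name declared in `backend/setup.py` or `pyproject.toml`):

```python
import importlib.util

# "backend" is a hypothetical package name; substitute the name from setup.py.
spec = importlib.util.find_spec("backend")
print(spec.origin if spec else "package not found; re-run pip install -e .")
```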
The Spark UI is exposed on localhost:8080 and the Spark master on port 7077. Start the local cluster with:
```bash
./docker/start-spark.sh
```
When you want to stop and remove the cluster:
```bash
./docker/stop-spark.sh
```
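Once the cluster is up, you can point a PySpark session at the exposed master URL. A minimal sketch, assuming `pyspark` is installed in the virtualenv (the app name is illustrative):

```python
from pyspark.sql import SparkSession

# Connect to the standalone master exposed by the Docker cluster.
spark = (
    SparkSession.builder
    .master("spark://localhost:7077")
    .appName("dedup-smoke-test")  # hypothetical app name
    .getOrCreate()
)

# Trivial job to confirm the workers respond.
print(spark.range(1000).count())
spark.stop()
```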
Start the backend API from the activated virtualenv:
```bash
source venv/bin/activate
python backend/src/run.py
```
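`run.py` presumably boots the FastAPI app with uvicorn, matching the container command shown in the `docker container ls` output earlier. A minimal sketch, where the module path `backend.src.main:app` is hypothetical (the real path is in `run.py`):

```python
import uvicorn

if __name__ == "__main__":
    # "backend.src.main:app" is a hypothetical module path; check run.py for the real one.
    uvicorn.run("backend.src.main:app", host="0.0.0.0", port=8000)
```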
The docs page exposes the API interface and lets you test the API directly from the browser:
```bash
http://localhost:8000/docs
```
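FastAPI also serves the raw OpenAPI 3 schema at `/openapi.json` by default. A small sketch that lists the available endpoints, assuming the API is running on port 8000:

```python
import json
from urllib.request import urlopen

# Fetch the OpenAPI schema that backs the /docs playground.
with urlopen("http://localhost:8000/openapi.json", timeout=5) as resp:
    schema = json.load(resp)

# Print each path with its supported HTTP methods.
for path, methods in schema["paths"].items():
    print(", ".join(m.upper() for m in methods), path)
```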