Make sure the following prerequisites are installed on your development machine:
- Git - Download Git
- Docker - Get Docker
- Python 3.7.7
- Apache Spark 3.0.1
- Apache Hadoop 3.2
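
A quick way to confirm the command-line prerequisites are on your `PATH` is sketched below. The helper name `check_tool` is hypothetical, and the command names (`git`, `docker`, `docker-compose`, `python3`) are assumptions that may differ on your system. It only checks that each command exists, not that it matches the versions listed above.

```shell
# Hypothetical helper: report whether a command is available on PATH.
# Returns 0 if the command is found, 1 otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Assumed command names; adjust them if your system differs
# (for example, python instead of python3).
for tool in git docker docker-compose python3; do
  check_tool "$tool" || true
done
```

Any line starting with `missing:` points to a tool that still needs to be installed or added to your `PATH`.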
Start the Docker containers
cd [WORK_DIRECTORY]/learning-apache-spark/deployments/local/
docker-compose up -d
Start the history server and the worker server
sh scripts/start.sh
Run the Spark application
docker exec -it spark-master bash
/spark/bin/spark-submit /root/main.py
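
If you prefer not to open an interactive shell, the last two commands can probably be combined into a single `docker exec` call. This is a minimal sketch, not part of the original setup: the function name `submit_app` and the `DRY_RUN` variable are hypothetical, while the container name and paths are copied from the commands above.

```shell
# Hypothetical helper: submit the application without an interactive shell.
# Container name and paths are copied from the steps above.
# If DRY_RUN is set to a non-empty value, the command is printed instead of run.
submit_app() {
  ${DRY_RUN:+echo} docker exec spark-master /spark/bin/spark-submit /root/main.py
}
```

Call `submit_app` to run the job, or `DRY_RUN=1 submit_app` to print the command first and check it.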