A network spark_nw will be created. Each node connects to this network. Only the master node will be in interactive mode (started with the -i flag).
For now, the network and the worker nodes are created manually; I plan to automate this with Docker Compose.
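The manual steps described above could look roughly like the following sketch. It is not the content of the actual scripts: the image name `spark-image`, the container names, the `-d` flag for workers, and the helper functions are all assumptions made for illustration. It runs as a dry run by default and only prints the commands.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DOCKER=docker to actually run them.
DOCKER="${DOCKER:-echo docker}"
NETWORK="spark_nw"                # network name used by this repository
IMAGE="${IMAGE:-spark-image}"     # hypothetical image name

create_network() {
    $DOCKER network create "$NETWORK"
}

start_node() {
    # $1: container name; remaining arguments are passed to "docker run"
    name="$1"
    shift
    $DOCKER run "$@" --network "$NETWORK" --name "$name" "$IMAGE"
}

create_network
start_node master -i     # only the master is interactive
start_node worker1 -d    # assumption: workers run detached
```

Keeping the Docker command in a variable makes it easy to check what would be executed before touching a real daemon.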
You might want to adjust the ports depending on your setup. They are currently prefixed with 3:
# in start-master.sh
-p 38080:8080 \
-p 34040:4040 \
-p 37077:7077 \
Internally the ports are the default ones.
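To make the prefix scheme explicit, here is a small sketch that derives the host ports from a prefix digit. The function `port_flags` is hypothetical and not part of start-master.sh; it only prints the `-p` arguments you would paste into the script.

```shell
#!/bin/sh
# Hypothetical helper: print one "-p HOST:CONTAINER" argument per Spark port.
# The host port is the container port with a prefix digit in front of it.
port_flags() {
    prefix="${1:-3}"
    # 8080: master web UI, 4040: application UI, 7077: master port
    for port in 8080 4040 7077; do
        printf '%s %s%s:%s\n' "-p" "$prefix" "$port" "$port"
    done
}

port_flags 3
```

Calling `port_flags 4` instead would yield host ports 48080, 44040 and 47077, while the ports inside the container stay unchanged.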
# Start the Docker daemon (on Docker 1.12 and later, use "sudo dockerd" instead)
sudo docker daemon
# Build this image
./build-image.sh
Note that the image version is derived from git and the current working-tree status. Commit your changes before building, or leave the tree dirty at all times.
# Set up a private Docker network for the cluster
./create-network.sh
# Create master node (the name is important)
./start-master.sh
# Create n worker nodes (repeat this step n times)
./start-worker.sh
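Until this is automated, repeating the worker step can be wrapped in a small loop. This is only a sketch: `start_workers` and the `START_WORKER` variable are hypothetical names, and it assumes that ./start-worker.sh can be called several times in a row without arguments.

```shell
#!/bin/sh
# Command used to start one worker. Override it for a dry run, e.g.
# START_WORKER="echo ./start-worker.sh"
START_WORKER="${START_WORKER:-./start-worker.sh}"

start_workers() {
    # $1: number of workers to start
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        # Stop at the first worker that fails to start
        $START_WORKER || return 1
        i=$((i + 1))
    done
}

# Example: start_workers 3
```

After the function is defined, `start_workers 3` is equivalent to running ./start-worker.sh three times.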
If you want to test your new cluster, you can use
./tests/pytest.sh
Make sure that Spark is installed locally as well, and that the SPARK_HOME environment variable is set correctly.
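A quick sanity check before running the test might look like the sketch below. The function `check_spark_home` is hypothetical, and it assumes the test relies on the `spark-submit` launcher that Spark distributions ship under `bin/`.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the local Spark installation.
check_spark_home() {
    if [ -z "${SPARK_HOME:-}" ]; then
        echo "SPARK_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$SPARK_HOME/bin/spark-submit" ]; then
        echo "no executable spark-submit under $SPARK_HOME/bin" >&2
        return 1
    fi
    echo "SPARK_HOME looks usable: $SPARK_HOME"
}

# Example: check_spark_home && ./tests/pytest.sh
```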
You might want to adapt the path of the to-be committed jar file, depending on your Spark environment (I build it myself).
Thanks to: