MLOps NYC 2019 training session: Running Spark on Kubernetes. This setup will be discussed at the MLOps NYC conference on September 24th, 2019. http://mlopsnnyc.com
- Docker desktop with Kubernetes enabled
To run the demo, configure Docker with 3 CPUs and 4 GB of RAM.
Make note of the location where you downloaded the files from this repository.
From a Windows command line or a Mac terminal, run:
kubectl get pods
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
Run the proxy to access the dashboard (you can stop it after running Helm):
kubectl proxy
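With the proxy running, the dashboard is typically reachable at http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/ (the default path when the v1.10.1 dashboard is deployed into kube-system).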
For this setup, download the Windows or Mac Helm binary and extract it to a local directory.
Documentation: https://helm.sh/docs/
ALL binaries: https://github.com/helm/helm/releases
Windows Binary: https://get.helm.sh/helm-v3.0.0-beta.3-windows-amd64.zip
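Optionally, confirm the binary works by printing its version:
<location of helm>\helm version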
Go to the location where you downloaded the files from this repository, then run:
kubectl apply -f spark-operator.json
<location of helm>\helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
<location of helm>\helm install incubator/sparkoperator --generate-name --namespace spark-operator --set sparkJobNamespace=default
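To confirm the operator started, you can list the pods in its namespace:
kubectl get pods -n spark-operator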
kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
Get the Spark service account. Make note of the sparkoperator-xxxxxx-spark name.
kubectl get serviceaccounts
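The output should look roughly like this (the generated suffix and ages will differ in your cluster):
NAME                           SECRETS   AGE
default                        1         10d
sparkoperator-xxxxxxxx-spark   1         1m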
Edit spark-pi.yaml and change the serviceAccount value to the name you noted in the previous command.
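For reference, the line to change sits under the driver section of spark-pi.yaml and looks roughly like this (the exact surrounding fields depend on your copy of the manifest):
spec:
  driver:
    serviceAccount: sparkoperator-xxxxxxxx-spark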
You must be in the directory where you extracted this repository
kubectl apply -f spark-pi.yaml
The driver and worker pods appear while the job is running; you should see spark-pi-driver and one worker.
kubectl get pods
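While the job runs, the pod list should look something like this (executor pod names and ages will vary):
NAME                       READY   STATUS    RESTARTS   AGE
spark-pi-driver            1/1     Running   0          30s
spark-pi-xxxxxxxx-exec-1   1/1     Running   0          15s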
List all Spark applications:
kubectl get sparkapplications
Detailed list in JSON format (watch the state under status):
kubectl get sparkapplications -o json
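To watch only the application state instead of the full JSON output, a jsonpath query like the following should work, assuming the application is named spark-pi as in spark-pi.yaml and the operator reports its state under status.applicationState:
kubectl get sparkapplications spark-pi -o jsonpath="{.status.applicationState.state}"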
Watch the job execution
kubectl logs spark-pi-driver -f
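When the job finishes, the driver log should include a line with the computed result, something like: Pi is roughly 3.14...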
Delete the application
kubectl delete -f spark-pi.yaml