spark-k8s-reverse-proxy

Yet Another Spark UI reverse proxy

Primary language: Go. License: MIT.

Spark UI for Kubernetes

This is a simple reverse proxy for the Spark UI that makes the UI easier to reach when running Spark on Kubernetes.

Screenshots

Home page:

Driver's logs page:

Driver's manifest page:

Architecture

The reverse proxy is deployed as an application inside the Kubernetes cluster and routes traffic from an ingress to the Spark driver UI.
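
To make the routing idea concrete, here is a minimal sketch using only the Go standard library. It is not taken from the project's source: `newDriverProxy` and `fetchThroughProxy` are hypothetical names, and the driver UI is replaced by a local test server so the example runs without a cluster.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newDriverProxy returns a handler that forwards every request to the
// Spark driver UI at target (for example "http://10.0.0.5:4040").
// Hypothetical helper, not the project's actual implementation.
func newDriverProxy(target string) (http.Handler, error) {
	u, err := url.Parse(target)
	if err != nil {
		return nil, err
	}
	return httputil.NewSingleHostReverseProxy(u), nil
}

// fetchThroughProxy starts a stand-in driver UI, puts a proxy in front
// of it, requests path through the proxy, and returns the response body.
func fetchThroughProxy(path string) (string, error) {
	// Stand-in for the Spark driver UI (assumption: no real driver here).
	driver := httptest.NewServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintf(w, "driver UI served %s", r.URL.Path)
		}))
	defer driver.Close()

	proxy, err := newDriverProxy(driver.URL)
	if err != nil {
		return "", err
	}
	// Stand-in for the proxy deployment that the ingress would target.
	front := httptest.NewServer(proxy)
	defer front.Close()

	resp, err := http.Get(front.URL + path)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := fetchThroughProxy("/jobs/")
	if err != nil {
		panic(err)
	}
	fmt.Println(body) // prints "driver UI served /jobs/"
}
```

In a real deployment the target would be the driver pod's address on port 4040 rather than a test server, and the proxy would typically pick the target per request based on which driver the URL refers to.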

Setup

The reverse proxy discovers Spark drivers by label selection, so the driver pod must carry the right labels. What you need to do depends on the submission mode:

  • In client mode: add the label spark-role=driver to the driver pod and expose port 4040.
  • In cluster mode: all required labels are added by default.

In the spark-submit command, enable the reverse proxy as follows:

/opt/spark/bin/spark-submit \
    --master k8s://https://kubernetes.default.svc:443 \
    --deploy-mode client \
    --name "$JOB_NAME" \
    ...\
    --conf spark.ui.reverseProxy=true \
    file:///opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar "$1"

Usage

Docker

TBC

docker pull helkaroui/spark-reverse-proxy:latest

Kubectl

TBC

Helm

To install using the Helm chart:

# Clone the repository, then run:
helm install my-release ./helm

Development

This project requires:

  • skaffold
  • kustomize
  • kind (for testing on a Kubernetes cluster)

To install these dependencies, run the following commands:

# macOS
brew install skaffold kustomize kind

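# Linux (Debian/Ubuntu). These tools may not be available in the default apt
# repositories; if so, follow each project's own installation instructions.
apt install skaffold kustomize kind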

To start modifying the source code, read the developers guide.