
Docker image for Spark history server on Kubernetes

Primary language: Shell. License: Apache License 2.0 (Apache-2.0).

Spark History Server Docker Image

This repo contains the Dockerfile and associated dependencies for building the image used by the Spark history server Helm chart. The history server itself needs nothing beyond an image that contains a Spark build. However, extra dependencies must be baked into the image for the history server to communicate with Google Cloud Storage, Azure Blob Storage, or AWS S3, if the user chooses one of those options as backend storage.

The Docker image corresponding to this repo is lightbend/spark-history-server. It's also the default image used in the Helm chart. Feel free to build your own image with your custom build of Spark or dependencies. The Helm chart also supports setting image.repository and image.tag to install the chart with your custom image.
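As a rough sketch of the custom-image workflow, the script below prints the Helm command that would install the chart with image.repository and image.tag overridden. The registry name, tag, release name, and chart path are all hypothetical placeholders, and the `helm install <release> <chart>` form assumes Helm 3. The command is printed rather than executed so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical values: replace with your own registry, tag, and chart reference.
IMAGE_REPO="${IMAGE_REPO:-example.com/my-team/spark-history-server}"
IMAGE_TAG="${IMAGE_TAG:-custom-1}"
CHART="${CHART:-./spark-history-server}"   # assumed local path to the chart

# Print the helm command that overrides the chart's default image.
# image.repository and image.tag are the values named in this README.
helm_install_cmd() {
  printf 'helm install spark-history-server %s --set image.repository=%s --set image.tag=%s\n' \
    "$CHART" "$IMAGE_REPO" "$IMAGE_TAG"
}

helm_install_cmd
```

Once the printed command looks right, it can be run directly after the custom image has been built and pushed to the registry.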

Google Cloud Storage

The Cloud Storage connector is included in the image so that the history server can read Spark event logs stored in GCS.

S3

The hadoop-aws module is included in the image to support AWS S3 integration.

Azure Blob Storage

The hadoop-azure module is included in the image to support Azure Blob Storage integration.
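To illustrate how the three backends differ from the history server's point of view, here is a minimal sketch that builds an event log URI and passes it through `spark.history.fs.logDirectory` in `SPARK_HISTORY_OPTS`. The URI schemes (`gs://`, `s3a://`, `wasbs://`) are the ones these connectors commonly use and are an assumption here, since this repo does not state them. The helper `log_dir_uri` and the bucket and path names are hypothetical.

```shell
#!/bin/sh
# Build an event log URI for a given backend.
# Arguments: backend, location, path. For azure, location is assumed to be
# of the form container@account.blob.core.windows.net.
log_dir_uri() {
  backend="$1"; location="$2"; path="$3"
  case "$backend" in
    gcs)   printf 'gs://%s/%s\n' "$location" "$path" ;;
    s3)    printf 's3a://%s/%s\n' "$location" "$path" ;;
    azure) printf 'wasbs://%s/%s\n' "$location" "$path" ;;
    *)     echo "unknown backend: $backend" >&2; return 1 ;;
  esac
}

# Placeholder bucket and path; the history server reads its log directory
# from spark.history.fs.logDirectory.
SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=$(log_dir_uri gcs my-bucket spark-events)"
export SPARK_HISTORY_OPTS
echo "$SPARK_HISTORY_OPTS"
```

Credentials for each backend still have to be supplied separately; this sketch only covers the log directory setting.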