spark-jupyter-env-docker

A Spark and Jupyter notebook environment with Scala, Python and R kernels

A Docker Compose stack with Scala, Python, Spark and JupyterLab

A local dev environment for Scala, Spark, Jupyter notebooks, Vault, Postgres and MinIO

Created with ❤️

This project helps you create a big data stack with (a minimal notebook sketch follows the list):

  • Spark
    • 1 master node
    • 2 worker nodes
  • JupyterLab
    • Python
    • Scala
    • R
  • Postgres
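
Once the stack is up, the JupyterLab Python kernel can drive the Spark cluster directly. The sketch below is only an illustration: the service name `spark-master` and port 7077 are assumptions about the Docker Compose network, so check `docker-compose.yml` for the actual values.

```python
# Minimal sketch: drive the cluster from the JupyterLab Python kernel.
# The service name "spark-master" and port 7077 are assumptions about the
# docker-compose network; check docker-compose.yml for the real values.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("stack-sanity-check")
    .master("spark://spark-master:7077")
    .getOrCreate()
)

# A trivial distributed job to confirm the two workers pick up tasks.
print(spark.range(1_000_000).count())
```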

Logical architecture

[logical architecture diagram]

Detailed architecture

[detailed architecture diagram]

Based on

How to start

Clone the repository

git clone https://github.com/raphaelmansuy/spark-jupyter-env-docker

Enter the project directory

cd spark-jupyter-env-docker

Build the images

./build.sh

Start the stack

docker-compose up --build

The stack is running 🎉 🚀

Open JupyterLab

open http://localhost:8888

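From a notebook you can, for example, read a table from the Postgres service over JDBC. This is only a sketch: the host `postgres`, database, table and credentials below are placeholders for whatever `docker-compose.yml` defines, it reuses the `spark` session from the earlier sketch, and it needs the Postgres JDBC driver on the Spark classpath.

```python
# Hypothetical example: read a Postgres table from a notebook cell, reusing
# the "spark" session created in the earlier sketch. Host, database, table and
# credentials are placeholders; use the values from docker-compose.yml.
# Requires the Postgres JDBC driver on the Spark classpath.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://postgres:5432/postgres")
    .option("dbtable", "public.my_table")      # placeholder table
    .option("user", "postgres")                # placeholder credentials
    .option("password", "postgres")
    .option("driver", "org.postgresql.Driver")
    .load()
)
df.show()
```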

Open the Spark UI

Open http://localhost:8080

Access MinIO (a high-performance, S3-compatible object store)

Open http://localhost:9001

🧓 user: minioadmin 🔐 password: minioadmin
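
Spark can read from and write to MinIO through the S3A connector. The snippet below is a sketch: the S3 API endpoint `http://minio:9000` is an assumption (port 9001 above is only the web console), the bucket `demo` must already exist, and `hadoop-aws` plus the AWS SDK must be on the Spark classpath. The credentials are the defaults listed above.

```python
# Sketch: point Spark's S3A connector at MinIO. The endpoint "http://minio:9000"
# is an assumption (port 9001 above is only the web console), the bucket "demo"
# must exist, and hadoop-aws plus the AWS SDK must be on the Spark classpath.
hconf = spark.sparkContext._jsc.hadoopConfiguration()
hconf.set("fs.s3a.endpoint", "http://minio:9000")
hconf.set("fs.s3a.access.key", "minioadmin")
hconf.set("fs.s3a.secret.key", "minioadmin")
hconf.set("fs.s3a.path.style.access", "true")
hconf.set("fs.s3a.connection.ssl.enabled", "false")

# Round-trip a small DataFrame through the object store.
spark.range(10).write.mode("overwrite").parquet("s3a://demo/numbers")
spark.read.parquet("s3a://demo/numbers").show()
```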

Access Vault (secrets manager)

Open http://127.0.0.1:8200

🔐 token: myrootid
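
From a notebook you can talk to Vault with the `hvac` Python client (install it with `pip install hvac` if it is not baked into the image). The in-network address `http://vault:8200` and the secret path are assumptions; the root token is the local dev token listed above and should never be used outside local development.

```python
# Sketch using the hvac client (pip install hvac if it is not in the image).
# The in-network address "http://vault:8200" and the secret path "demo" are
# assumptions; the root token below is the local dev token from this README.
import hvac

client = hvac.Client(url="http://vault:8200", token="myrootid")
print(client.is_authenticated())

# Read a KV v2 secret previously written at secret/data/demo (hypothetical path).
secret = client.secrets.kv.v2.read_secret_version(path="demo")
print(secret["data"]["data"])
```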

To delete the stack and destroy volumes

💣 This command deletes all the containers and their volumes

docker-compose down --volumes

Voilà 🚀