/sparkapps-docker

Dockerized Apache Spark applications

Run Java-based Apache Spark applications with Docker

Introduction

This repo contains a few examples that show how to develop and run Apache Spark applications in a Docker environment.

The code is organized into a number of Maven submodules; please consult the respective README.md files to learn more.

List of examples

  1. Word count in Apache Spark
  2. Example with Spark SQL and Hive

Building the examples

First, build the Maven modules and Docker images from the repository root:

mvn clean package

Then verify that the images have been created by listing the local Docker images:

docker images
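
The two steps above can be combined into a small script that fails early when a required tool is missing. This is only a sketch: the helper names `require_tool` and `build_and_verify` are hypothetical and not part of this repo, and it assumes that `mvn` and `docker` are on the `PATH` and that it is run from the repository root.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): succeed only if the
# given command can be found on the PATH.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing required tool: $1" >&2
  return 1
}

# Hypothetical wrapper around the two documented steps:
# build with Maven, then list the local Docker images.
build_and_verify() {
  require_tool mvn || return 1
  require_tool docker || return 1
  mvn clean package || return 1
  docker images
}
```

After sourcing the script, calling `build_and_verify` runs the build and prints the image list, stopping at the first step that fails.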

Implementation notes