docker-hadoop-hive-parquet

This project shows how to spin up a Hadoop cluster with Hive so that you can run SQL queries on Parquet files. The node images are based on the bde2020 base images: https://hub.docker.com/u/bde2020

All of this makes more sense alongside the Medium article linked in the repository :)

Run

Swarm

  1. docker swarm init
  2. docker stack deploy --compose-file docker-compose.yml hadoop
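
Once the stack is deployed, it helps to check that the services came up and to know how to tear everything down again. The sketch below wraps the usual Docker CLI commands in two small helper functions. The function names (check_stack, teardown_stack) and the variables are made up for this example and are not part of the repository. The stack name "hadoop" and the compose file name are taken from the steps above.

```shell
#!/bin/sh
# Names taken from the deploy step above.
STACK_NAME="hadoop"
COMPOSE_FILE="docker-compose.yml"

# Hypothetical helper: list the stack's services and their tasks.
# A service is up when its REPLICAS column shows e.g. 1/1.
check_stack() {
  docker stack services "$STACK_NAME"
  docker stack ps "$STACK_NAME"
}

# Hypothetical helper: remove the stack, then leave swarm mode.
# --force is required because this node is a swarm manager.
teardown_stack() {
  docker stack rm "$STACK_NAME"
  docker swarm leave --force
}
```

Source the file (or paste the functions into your shell), then run check_stack after deploying and teardown_stack when you are done. If a service stays at 0/1 replicas, docker service logs followed by the service name printed by check_stack is usually the first place to look.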