/docker-hadoop

Simple functional examples of running Hadoop + Hive in Docker with Docker Compose

Docker for Hadoop

The inspiration for this project came from my need to quickly spin up functional Hadoop clusters for testing things like my Hive JDBC Driver project. These images are not intended to be production-ready Hadoop clusters, although I'm quite certain they could be extended to get there.

Many thanks to projects like https://github.com/big-data-europe/docker-hadoop and various forks for a starting place and ideas.

I'm currently maintaining 3 Hadoop/Hive configurations:

Cleaning up Docker

You can easily delete previously downloaded images for these examples with the following command:

docker images -a | grep "timveil" | awk '{print $3}' | xargs docker rmi -f

Or, for a full system prune limited to unused resources that carry this project's maintainer label (this is what I usually do):

docker system prune -a -f --volumes --filter "label=maintainer=tjveil@gmail.com"
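The grep-based command above matches "timveil" anywhere on a line of `docker images` output, not only in the repository name. As a minimal sketch, assuming the images for these examples live under the `timveil/` repository prefix, the hypothetical helper `image_ids_for_owner` below matches on the REPOSITORY column only. It reads text from stdin, so it can be checked without touching Docker.

```shell
#!/bin/sh
# Hypothetical helper (not part of this project): read `docker images -a`
# style output on stdin and print the IMAGE ID column (third field) for rows
# whose REPOSITORY (first field) starts with "<owner>/".
image_ids_for_owner() {
    owner="$1"
    # NR > 1 skips the header row; index(...) == 1 means "starts with prefix".
    awk -v prefix="${owner}/" 'NR > 1 && index($1, prefix) == 1 { print $3 }'
}

# Example usage (assumes Docker is installed and on PATH):
#   docker images -a | image_ids_for_owner timveil | sort -u | xargs docker rmi -f
```

The `sort -u` step in the usage line is there because an image with several tags appears on several rows with the same ID.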

Remove all stopped containers:

docker ps -aq --no-trunc -f status=exited | xargs docker rm

Remove all dangling images:

docker images -q --filter dangling=true | xargs docker rmi
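Docker also ships `docker container prune` and `docker image prune`, which cover roughly the same ground as the two pipelines above (note that `container prune` removes all stopped containers, which can be slightly broader than `status=exited`). The following is a sketch only: the `run` and `cleanup_docker` function names and the `DRY_RUN` variable are hypothetical, introduced here so the commands can be previewed before anything is deleted.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch for this sketch; it defaults to 1 (preview only).
DRY_RUN="${DRY_RUN:-1}"

# Print the command instead of executing it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Remove stopped containers, then dangling images, using Docker's prune commands.
cleanup_docker() {
    run docker container prune -f
    run docker image prune -f
}

# Example usage:
#   cleanup_docker              # preview
#   DRY_RUN=0; cleanup_docker   # actually delete (assumes Docker is installed)
```

Defaulting to a preview is a deliberate choice here, since both commands delete without prompting when `-f` is passed.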