This repository will allow you to run a single-node deployment of the Cloudera open-source distribution using Docker.
To get the most out of Docker, this project uses the docker-machine and docker-compose tools.
The only requirements are the Docker stack and VirtualBox.
This step is optional, but creating a separate Docker machine for Cloudera is recommended, given how many resources it consumes.
docker-machine create cloudera -d virtualbox --virtualbox-cpu-count 2 --virtualbox-memory 6144
This command creates a boot2docker virtual machine with 6GB of memory and 2 CPUs. Adjust the resource allocation to your needs (the recommended memory allocation is 8GB).
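To apply the recommended 8GB allocation, note that the --virtualbox-memory value is given in megabytes. The sketch below uses a hypothetical helper, machine_create_cmd (not part of this repository), that only prints the command for a given memory size in GB and CPU count, so it can be reviewed before being run:

```shell
# Hypothetical helper (an assumption, not part of this repository):
# prints the docker-machine command for a memory size in GB and a CPU count.
machine_create_cmd() {
  mem_gb="$1"
  cpus="$2"
  # --virtualbox-memory expects megabytes, so convert from GB.
  echo "docker-machine create cloudera -d virtualbox --virtualbox-cpu-count ${cpus} --virtualbox-memory $((mem_gb * 1024))"
}

# Print the command for the recommended 8GB allocation with 2 CPUs.
machine_create_cmd 8 2
```

Copy the printed command into your terminal once the values look right.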
Add an entry to your hosts file so that you can access the GUI from your browser:
echo -e "$(docker-machine ip cloudera)\tquickstart.cloudera" | sudo bash -c 'cat >> /etc/hosts'
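Running the command above more than once appends a duplicate line each time. A minimal sketch of a guard against that follows, assuming a hypothetical helper add_host_entry (not part of this repository) that appends the entry only if the hostname is not already listed. The IP address 192.168.99.100 is a placeholder, and the demonstration writes to a temporary file rather than /etc/hosts:

```shell
# Hypothetical helper (an assumption, not part of this repository):
# appends "<ip><TAB><name>" to a hosts-style file unless <name> is already
# present as the second field of some line.
add_host_entry() {
  ip="$1"
  name="$2"
  file="$3"
  if awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' "$file" 2>/dev/null; then
    echo "entry for ${name} already present in ${file}"
  else
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
  fi
}

# Demonstration against a scratch file; 192.168.99.100 is a placeholder IP.
scratch="$(mktemp)"
add_host_entry 192.168.99.100 quickstart.cloudera "$scratch"
add_host_entry 192.168.99.100 quickstart.cloudera "$scratch"
cat "$scratch"
rm -f "$scratch"
```

To use it for real, pass the output of docker-machine ip cloudera as the address and run the function with root privileges against /etc/hosts.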
Point the Docker client in your terminal at the new machine:
eval "$(docker-machine env cloudera)"
Start the services defined in the docker-compose file:
docker-compose up -d
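Because docker-compose up -d returns as soon as the containers are started, the services inside may need some time before they respond. A minimal sketch of a generic retry loop follows, assuming a hypothetical helper retry (not part of this repository) that reruns a command until it succeeds or the attempts run out:

```shell
# Hypothetical helper (an assumption, not part of this repository):
# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds; returns 1 if every attempt fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only when another attempt is still to come.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (requires curl): poll the Hue dashboard up to 30 times, 10 seconds apart.
# retry 30 10 curl -fsS -o /dev/null http://quickstart.cloudera:8888/
```

The attempt count and delay in the example are arbitrary starting points; tune them to how long startup takes on your machine.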
Hue dashboard: http://quickstart.cloudera:8888/
Sample Hadoop command that lists the files at the HDFS root level and can be run from your local terminal:
hdfs dfs -ls hdfs://quickstart.cloudera/
Note that this requires a local Hadoop installation.
OSX users can install it via brew install hadoop
- Break down the Cloudera image into separate services to take advantage of the docker-compose tool.