
BigData Cluster with Docker

Primary language: Shell. License: Apache License 2.0.

BigData Cluster

An off-the-shelf solution for running a Big Data cluster with Docker.

Software installed:

Prerequisites

Before you start, the following software must be installed on all your machines:

Installing Docker Engine

https://docs.docker.com/engine/installation/
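Once Docker Engine is installed, a quick check on each machine can confirm that the CLI runs and the daemon answers. This is a minimal sketch, not part of the repository: `check_docker` and the `DOCKER` override variable are hypothetical names, and the default `sudo docker` simply mirrors the commands used in this README.

```shell
# DOCKER is a hypothetical override; it defaults to the "sudo docker" used in this README.
DOCKER="${DOCKER:-sudo docker}"

# Print a one-line status for the local Docker installation.
check_docker() {
    # "docker --version" only needs the CLI binary.
    if ! $DOCKER --version >/dev/null 2>&1; then
        echo "docker: CLI not found or not runnable"
        return 1
    fi
    # "docker info" also needs a reachable daemon.
    if ! $DOCKER info >/dev/null 2>&1; then
        echo "docker: CLI found, but the daemon is not reachable"
        return 1
    fi
    echo "docker: ok"
}
```

Calling `check_docker` on the master and on every node before the setup steps should print `docker: ok`.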

Setup

Network

Run the following command on your master machine.

$ sudo docker swarm init --advertise-addr <ip of your master>

Then have every node of your cluster join the Docker swarm.

$ sudo docker swarm join --token <secret token> <ip of your master>:2377
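If the token printed by `docker swarm init` has been lost, it can be read back on the master with `docker swarm join-token`. The sketch below assembles the join command for a worker. `print_join_command` and the `DOCKER` override are hypothetical names, and port 2377 is assumed because it is Docker's default swarm management port.

```shell
# DOCKER is a hypothetical override; it defaults to the "sudo docker" used in this README.
DOCKER="${DOCKER:-sudo docker}"

# Run on the master. Prints the command a worker node would run to join the swarm.
# $1: IP address of the master (the value given to --advertise-addr).
print_join_command() {
    master_ip="$1"
    # -q prints only the worker token instead of the full help text.
    token="$($DOCKER swarm join-token -q worker)" || return 1
    echo "sudo docker swarm join --token $token $master_ip:2377"
}
```

After the nodes have joined, `sudo docker node ls` on the master lists them together with their status.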

Now that all the nodes are connected to the swarm, create an overlay network on your master.

$ sudo docker network create --attachable --driver overlay --subnet 10.0.1.0/24 hadoop_cluster
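The containers started later rely on fixed addresses inside this subnet (10.0.1.2 for the master, 10.0.1.254 for the DNS server), so it may be worth confirming that the network was created with the expected driver and subnet. A minimal sketch, with `check_network` and `DOCKER` as hypothetical names:

```shell
# DOCKER is a hypothetical override; it defaults to the "sudo docker" used in this README.
DOCKER="${DOCKER:-sudo docker}"

# Compare the driver and subnet of hadoop_cluster with the values used at creation.
check_network() {
    actual="$($DOCKER network inspect \
        --format '{{.Driver}} {{range .IPAM.Config}}{{.Subnet}}{{end}}' \
        hadoop_cluster)" || return 1
    if [ "$actual" = "overlay 10.0.1.0/24" ]; then
        echo "hadoop_cluster: ok"
    else
        echo "hadoop_cluster: unexpected settings: $actual"
        return 1
    fi
}
```

Run `check_network` on the master, where the network was created.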

DNS Server

This container runs Serf and dnsmasq.

Serf is a tool for cluster membership, failure detection, and orchestration.

dnsmasq is a lightweight, easy-to-configure DNS forwarder.

This container lets the cluster resolve internal hostnames and detect when a new slave joins the cluster.

Build the image

$ sudo docker build -t dns:latest -f Dockerfile-DNS .

Run the container

$ sudo docker run -d -ti --name dns --add-host master:10.0.1.2 --hostname cluster-dns --ip 10.0.1.254 --network hadoop_cluster -e TZ=Europe/Rome <image id> bash -c "/root/boot_dns.sh"
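Because the master and the slaves point at this container through `--dns 10.0.1.254`, it helps to confirm that it is still running before starting them. The sketch below asks Docker for the container state. `container_running` and `DOCKER` are hypothetical names, not part of the repository.

```shell
# DOCKER is a hypothetical override; it defaults to the "sudo docker" used in this README.
DOCKER="${DOCKER:-sudo docker}"

# Report whether the named container is currently running.
# $1: container name, e.g. "dns" or "master".
container_running() {
    name="$1"
    # .State.Running is "true" or "false"; inspect fails if the container does not exist.
    state="$($DOCKER inspect --format '{{.State.Running}}' "$name" 2>/dev/null)" || state=""
    if [ "$state" = "true" ]; then
        echo "$name: running"
    else
        echo "$name: not running"
        return 1
    fi
}
```

For example, `container_running dns` checks this container. If it reports `not running`, `sudo docker logs dns` shows whatever the boot script printed before exiting.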

Master

Build the image

$ sudo docker build -t hadoop:latest .

Run the container

$ sudo docker run -d -ti --name master -p 54311:54311 -p 50070:50070 -p 9000:9000 -p 8030:8030 -p 8031:8031 -p 8032:8032 -p 8033:8033 -p 8088:8088 -p 2122:22 --add-host master:10.0.1.2 --add-host cluster-dns:10.0.1.254 --hostname master --ip 10.0.1.2 --dns 10.0.1.254 --network hadoop_cluster -e TZ=Europe/Rome <image id> bash -c "/root/boot_master.sh"
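The published ports can be used to check from the host that the master came up. Assuming the image runs Hadoop 2.x with its default ports (the README does not state the version), 50070 would be the NameNode web UI and 8088 the YARN ResourceManager web UI. The sketch below probes such a port over HTTP. `probe_ui` and the `CURL` override are hypothetical names.

```shell
# CURL is a hypothetical override; it defaults to the curl binary on PATH.
CURL="${CURL:-curl}"

# Report whether anything answers HTTP on host:port.
# $1: host name or IP, $2: port.
probe_ui() {
    host="$1"
    port="$2"
    # Without -f, curl exits 0 for any HTTP response, so this tests reachability only.
    if $CURL -s -o /dev/null --max-time 5 "http://$host:$port/"; then
        echo "$host:$port reachable"
    else
        echo "$host:$port not reachable"
        return 1
    fi
}
```

For example, `probe_ui localhost 50070` and `probe_ui localhost 8088` on the machine hosting the master container. The services may need some time after the container starts before they answer.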

Slaves

Build the image

$ sudo docker build -t hadoop:latest .

Run the container

$ sudo docker run -d -ti --name slave --add-host master:10.0.1.2 --add-host cluster-dns:10.0.1.254 --dns 10.0.1.254 --network hadoop_cluster -e TZ=Europe/Rome <image id> bash -c "/root/boot_slave.sh"
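Docker rejects a second container with the same name on one host, so running several slaves on the same machine needs a distinct `--name` for each. The sketch below only prints the run commands so they can be reviewed first. `slave_run_command`, the `slave<N>` naming, and the use of the `hadoop:latest` tag from the build step in place of `<image id>` are assumptions, not part of the repository.

```shell
# Print the run command for one slave container with a unique name.
# $1: slave index (used in the container name), $2: image, defaulting to hadoop:latest.
slave_run_command() {
    index="$1"
    image="${2:-hadoop:latest}"
    echo "sudo docker run -d -ti --name slave$index" \
        "--add-host master:10.0.1.2 --add-host cluster-dns:10.0.1.254" \
        "--dns 10.0.1.254 --network hadoop_cluster -e TZ=Europe/Rome" \
        "$image bash -c \"/root/boot_slave.sh\""
}

# Example: print the commands for three slaves.
# for i in 1 2 3; do slave_run_command "$i"; done
```

Each printed line can be pasted into a shell on the machine that should host that slave.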