GeoDocker Cluster
Docker containers with a prepared environment for running GeoTrellis, GeoMesa, and GeoWave jobs. These images create a set of containers that run in a distributed fashion on a single machine. In practice, this requires care to ensure that enough memory is available for all containers.
Environment
- Hadoop (HDFS + YARN) 2.7.1
- ZooKeeper 3.4.6
- Accumulo 1.6.x / 1.7.x (the specific version to use is configurable)
- Spark 1.5.2 (Scala 2.10 / Scala 2.11)
Short repository description (index of README docs)
Base images:
-
- Contains a Dockerfile to build an image with Hadoop, ZooKeeper, Accumulo and Spark installed (but not configured).
- Available on Dockerhub:
-
- Contains a cluster with a ZooKeeper node, working in single-node mode.
- Available on Dockerhub:
- Description of the Dockerhub image tags:
- 0.1.0 - contains Accumulo 1.6.5 (2gb configuration) and Spark 1.5.2 (Scala 2.10)
- 0.1.1 - contains Accumulo 1.6.5 (2gb configuration) and Spark 1.5.2 (Scala 2.11)
- 0.1.2 - contains Accumulo 1.6.5 (512mb configuration) and Spark 1.5.2 (Scala 2.10)
- 0.1.3 - contains Accumulo 1.6.5 (512mb configuration) and Spark 1.5.2 (Scala 2.11)
- 0.2.0 - contains Accumulo 1.7.0 (2gb configuration) and Spark 1.5.2 (Scala 2.10)
- 0.2.1 - contains Accumulo 1.7.0 (2gb configuration) and Spark 1.5.2 (Scala 2.11)
- 0.2.2 - contains Accumulo 1.7.1 (512mb configuration) and Spark 1.5.2 (Scala 2.10)
- 0.2.3 - contains Accumulo 1.7.1 (512mb configuration) and Spark 1.5.2 (Scala 2.11)
- latest - contains Accumulo 1.7.1 (512mb configuration) and Spark 1.5.2 (Scala 2.10)
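The tag list above is effectively a lookup on three choices: the Accumulo release line, the memory configuration, and the Scala version. As a minimal sketch, the function below encodes that list in shell. The name `geodocker_tag` and its argument format are assumptions made for illustration; nothing like it ships with the repository.

```shell
#!/bin/sh
# geodocker_tag is a hypothetical helper, not part of this repository.
# It maps (Accumulo release line, memory configuration, Scala version)
# to the image tag from the list above.
# Note: in the 1.7 line, the 2gb tags ship Accumulo 1.7.0 and the
# 512mb tags ship Accumulo 1.7.1, as stated in the tag list.
geodocker_tag() {
  accumulo="$1"   # 1.6 or 1.7
  memory="$2"     # 2gb or 512mb
  scala="$3"      # 2.10 or 2.11
  case "$accumulo/$memory/$scala" in
    1.6/2gb/2.10)   echo 0.1.0 ;;
    1.6/2gb/2.11)   echo 0.1.1 ;;
    1.6/512mb/2.10) echo 0.1.2 ;;
    1.6/512mb/2.11) echo 0.1.3 ;;
    1.7/2gb/2.10)   echo 0.2.0 ;;
    1.7/2gb/2.11)   echo 0.2.1 ;;
    1.7/512mb/2.10) echo 0.2.2 ;;
    1.7/512mb/2.11) echo 0.2.3 ;;
    *)
      echo "unknown combination: $accumulo/$memory/$scala" >&2
      return 1
      ;;
  esac
}
```

For example, `geodocker_tag 1.7 512mb 2.10` prints `0.2.2`, which per the list above has the same contents as `latest`.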
GeoTrellis, GeoMesa, and GeoWave:
- install directory
- Contains scripts to install GeoTrellis, GeoMesa, and GeoWave into the cluster and to run test examples that verify the cluster and the libraries are operating correctly.
Build a multinode cluster
A more detailed description of how to build and run the containers can be found in each image directory.
- Build serf container
cd serf; ./build.sh
- Build base container
cd base; ./build.sh
- Build master and slave containers
cd nodes; ./build.sh
Start the n-node cluster.
cd nodes; ./start-cluster.sh --nodes=n # n >= 1
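The build and start steps above can be chained into one script. This is only a sketch: it assumes it is called from the repository root, and the function names `run_in` and `build_and_start` as well as the `DRY_RUN` variable are hypothetical. The only commands it runs are the `build.sh` and `start-cluster.sh` invocations listed above.

```shell
#!/bin/sh
# Hypothetical wrapper around the build and start steps listed above.
# Assumes the current directory is the repository root, where the
# serf, base, and nodes directories live.

# Run a command inside a directory (in a subshell, so the caller's
# working directory is unchanged), or only print it when DRY_RUN=1.
run_in() {
  dir="$1"; shift
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "(cd $dir && $*)"
  else
    (cd "$dir" && "$@")
  fi
}

# Build all images in order, then start a cluster with the given
# number of nodes (defaults to 1). Stops at the first failing step.
build_and_start() {
  nodes="${1:-1}"
  # Reject anything that is not a plain non-negative integer.
  case "$nodes" in
    ''|*[!0-9]*) echo "nodes must be a positive integer" >&2; return 1 ;;
  esac
  # The cluster needs at least one node (n >= 1).
  [ "$nodes" -ge 1 ] || { echo "nodes must be >= 1" >&2; return 1; }

  run_in serf  ./build.sh &&
  run_in base  ./build.sh &&
  run_in nodes ./build.sh &&
  run_in nodes ./start-cluster.sh "--nodes=$nodes"
}
```

After sourcing the file, `build_and_start 3` would build the three images and start a three-node cluster. Setting `DRY_RUN=1` beforehand only prints the commands, which is a cheap way to check the order before a long build.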
Probable issues and solutions
A possible use case is accessing the cluster from outside the GeoDocker Cluster (from a separate machine or from the host machine). A likely issue arises when running Accumulo-related jobs that require a ZooKeeper node address:
WARN impl.ServerClient: Failed to find an available server
in the list of servers: [master1.gt:9997 (120000), slave1.gt:9997 (120000)]
The cause of the problem is that the Docker cluster uses its own DNS, so the client machine where this error occurred has no DNS record for the master1.gt hostname. The solution is to provide the record manually (for example, by adding it to the /etc/hosts file).
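The /etc/hosts fix can be sketched as a small idempotent helper. The function name `add_host_entry` is hypothetical, and the IP addresses you pass to it are something you have to look up yourself; this README does not state them. The hostnames master1.gt and slave1.gt come from the warning shown above.

```shell
#!/bin/sh
# Hypothetical helper: append "IP<TAB>hostname" to a hosts file unless
# the hostname is already listed there. The third argument is the hosts
# file to edit and defaults to /etc/hosts (which needs root to modify).
add_host_entry() {
  ip="$1"
  host="$2"
  file="${3:-/etc/hosts}"
  # Look for the hostname as a whole field after the address column,
  # ignoring comment lines and anything after an inline "#".
  if awk -v h="$host" '
       /^[[:space:]]*#/ { next }
       {
         for (i = 2; i <= NF; i++) {
           if ($i ~ /^#/) break
           if ($i == h) found = 1
         }
       }
       END { exit found ? 0 : 1 }' "$file"; then
    return 0   # already present, nothing to do
  fi
  printf '%s\t%s\n' "$ip" "$host" >> "$file"
}
```

One way to find a container's address is `docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' <container>`, where `<container>` stands for whatever name or ID `docker ps` shows for the master or slave node. With that address in hand, run `add_host_entry <ip> master1.gt` as root on the client machine, and repeat for slave1.gt.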
License
- Based on a repository: https://github.com/alvinhenrick/hadoop-mutinode
- Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0