First of all, you have to install docker-hadoop by following this guide:
https://clubhouse.io/developer-how-to/how-to-set-up-a-hadoop-cluster-in-docker/
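In short, that guide amounts to cloning the big-data-europe/docker-hadoop repository and starting the cluster with docker-compose. The commands below are only a rough sketch of that setup; follow the guide for the details:

```bash
# Clone the big-data-europe docker-hadoop setup and start the cluster
git clone https://github.com/big-data-europe/docker-hadoop.git
cd docker-hadoop
docker-compose up -d

# Check that the five containers (namenode, datanode1, resourcemanager,
# nodemanager1, historyserver) are up
docker ps
```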
If you want to configure your environment, you have to follow these steps:
- Stop the docker containers: `docker stop historyserver nodemanager1 resourcemanager datanode1 namenode`
- Go to your local `big-data-europe/docker-hadoop` directory and modify the `docker-compose.yml` file, adding these two lines under `namenode`:
  `namenode: ... - <YOUR-LOCAL-PATH>:/home/shared ...`
  In `<YOUR-LOCAL-PATH>` you can insert the local directory that contains your hadoop examples. An example of the `docker-compose.yml` file is inside `./Examples` (a sketch of the change is also shown after these steps).
- Remove the docker containers: `docker rm historyserver nodemanager1 resourcemanager datanode1 namenode`
- Run this command: `docker-compose up -d`
- Now you can use your examples with docker! Use this command: `docker exec -it namenode bash`
  If you are on Ubuntu or you use WSL, you can create an alias for this command by adding this line to your `~/.bashrc` file: `alias docker-hadoop='docker exec -it namenode bash'`
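For reference, here is a minimal sketch of what the `namenode` service could look like after the change described in the second step above. Everything except the `<YOUR-LOCAL-PATH>:/home/shared` line is an assumption about what the stock big-data-europe/docker-hadoop file contains; the actual example file is in `./Examples`.

```yaml
namenode:
  # ...image, container_name, ports, environment, etc. as in the original file...
  volumes:
    - hadoop_namenode:/hadoop/dfs/name    # existing entry, if your file already has it
    - <YOUR-LOCAL-PATH>:/home/shared      # added line: mounts your local examples directory
```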
The `./Scripts` directory contains two scripts:
- `start-docker-hadoop.sh`: to start the docker hadoop containers;
- `stop-docker-hadoop.sh`: to stop the docker hadoop containers.
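The exact contents of the two scripts are in `./Scripts`; as a rough sketch (an assumption based on the docker commands used above, not a copy of the repository's files), they boil down to something like:

```bash
#!/bin/bash
# start-docker-hadoop.sh (sketch): start the five docker-hadoop containers
docker start namenode datanode1 resourcemanager nodemanager1 historyserver
```

```bash
#!/bin/bash
# stop-docker-hadoop.sh (sketch): stop the same containers
docker stop historyserver nodemanager1 resourcemanager datanode1 namenode
```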
To use them everywhere, follow these steps:
- Create the `hadoop-scripts` directory in `/usr/local`: `sudo mkdir /usr/local/hadoop-scripts`
- Copy the `Scripts` content into `/usr/local/hadoop-scripts`: `sudo cp ./Scripts/* /usr/local/hadoop-scripts`
- Change the owner: `sudo chown -R <your_user>:<your_group> /usr/local/hadoop-scripts`
- Add the execution permission: `sudo chmod a+x /usr/local/hadoop-scripts/*`
- Add these lines at the end of the `~/.bashrc` file:
  `export HADOOP_SCRIPTS_HOME=/usr/local/hadoop-scripts`
  `export PATH=$PATH:$HADOOP_SCRIPTS_HOME`
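After that, reload your shell configuration (or open a new terminal) and the scripts can be run from any directory; the names below assume the scripts described above:

```bash
source ~/.bashrc          # pick up the new PATH in the current shell
start-docker-hadoop.sh    # start the docker-hadoop containers from anywhere
stop-docker-hadoop.sh     # stop them again when you are done
```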
Inside the `./handoop` directory you can find a simple example called WordCount. First of all, you have to install Maven.
Then you can generate the jar file using the following command:
`mvn package`
Now you can use it!
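As a hypothetical end-to-end run (the jar name, main class, and file names below are assumptions for illustration; adapt them to what `mvn package` actually puts in your `target/` directory), you can build the example in the directory mounted at `/home/shared` and run it from inside the namenode container:

```bash
# On the host: build the example inside the directory mounted at /home/shared
cd <YOUR-LOCAL-PATH>/WordCount        # hypothetical layout of your examples directory
mvn package                           # produces target/<something>.jar

# Enter the namenode container (or use the docker-hadoop alias)
docker exec -it namenode bash

# Inside the container: prepare some input in HDFS
hdfs dfs -mkdir -p /user/root/input
hdfs dfs -put /home/shared/WordCount/input.txt /user/root/input

# Run the job (jar path and class name are placeholders, not the repo's actual names)
hadoop jar /home/shared/WordCount/target/wordcount-1.0.jar WordCount /user/root/input /user/root/output

# Inspect the result
hdfs dfs -cat /user/root/output/part-r-00000
```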