This repo contains MapReduce source code and instructions for running it on Docker, based on the setup from with optimizations.
This setup has five major parts:
- assets: This folder contains the binaries for Hadoop and Java. Please download the JDK 8 and Hadoop 3.3.6 binaries, rename them to jdk-8u202-linux-x64.tar.gz and hadoop-3.3.6.tar.gz respectively, and put them in the 'assets' folder so that the build works properly.
- config-files: All Hadoop configuration files for ${HADOOP_HOME}/etc/hadoop/ are here.
- gnome-kmer-counting: Mapper and Reducer for the genome k-mer exercise from HW1, using Hadoop Streaming.
- scripts: Scripts for building, running, and cleaning up the Docker images and containers for this repo.
- mapred-src: Scripts for HW2: MapReduce in Python (Streaming) and Java.
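
For readers unfamiliar with the Hadoop Streaming pattern used in gnome-kmer-counting, here is a minimal sketch of a k-mer mapper and reducer in Python. It is not the code in that folder. The function names, the default k=3, and the one-sequence-per-line input format are assumptions made for illustration.

```python
from itertools import groupby


def map_kmers(lines, k=3):
    """Mapper sketch: emit 'kmer<TAB>1' for every k-length window of each line.

    Assumes one sequence per input line; k=3 is an illustrative default.
    """
    for line in lines:
        seq = line.strip().upper()
        # Lines shorter than k produce an empty range and emit nothing.
        for i in range(len(seq) - k + 1):
            yield f"{seq[i:i + k]}\t1"


def reduce_counts(lines):
    """Reducer sketch: sum the counts of each key.

    Relies on the input being sorted by key, which is how Hadoop Streaming
    delivers mapper output to a reducer.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for kmer, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{kmer}\t{sum(int(count) for _, count in group)}"
```

In a streaming job, each function would be wrapped in a small script that reads `sys.stdin` and prints every yielded line, since Hadoop Streaming talks to the mapper and reducer through standard input and output.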
All assets are available in this folder. Please download the right files for the
Please run the scripts in the order given in the 'scripts' folder.
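
Before running the scripts, it can help to confirm that the two archives are in place. The following is a small sketch of such a check. The helper name `missing_assets` is hypothetical and not part of this repo, and it assumes you run it from the repo root so that 'assets' resolves correctly.

```python
from pathlib import Path

# File names as given in the description of the 'assets' folder.
REQUIRED_ASSETS = ("hadoop-3.3.6.tar.gz", "jdk-8u202-linux-x64.tar.gz")


def missing_assets(assets_dir="assets"):
    """Return the required archive names that are not present in assets_dir."""
    root = Path(assets_dir)
    return [name for name in REQUIRED_ASSETS if not (root / name).is_file()]
```

An empty list from `missing_assets()` means both archives were found. Any names returned still need to be downloaded and renamed.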
Note for Java run:
- Install the latest Java 8 release suitable for your OS.
- Follow the Java for VS Code tutorial on using a Maven project at
- Remember to export the JAR in VS Code with the option "without main class".