Word Relatedness - Assignment 2 in Distributed System Programming course at BGU

The assignment on the course website: https://www.cs.bgu.ac.il/~dsp162/Assignments/Assignment_2

Example application

The app is split into two modules, the local module and the mapreduce module, which share a common parent. To run the app (currently a wordcount app):

  • Install the Maven project (mvn install)
  • Locate the jar that Maven created (in the example below it is under the mapreduce module's target directory)
  • Run the job with hadoop: hadoop jar <jar-path> <wordcount-hdfs-input-path> <wordcount-hdfs-output-path>
    For example: hadoop jar /Users/dsp-assignment-2/dsp-assignment-2-mapreduce/target/dsp-assignment-2-mapreduce-1.0-SNAPSHOT-job.jar wordcount/input wordcount/output
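
To give a rough idea of what the wordcount job computes, here is a minimal sketch in plain Java with no Hadoop dependency. It is not the code in this repository: the class name WordCountSketch and the methods map and countWords are hypothetical, and splitting on whitespace is an assumption about how words are tokenized. The real job runs the same two phases as a Hadoop Mapper and Reducer over the HDFS input path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the wordcount job; not part of this project.
public class WordCountSketch {

    // "Map" phase: emit one (word, 1) pair per whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // "Reduce" phase: sum the counts emitted for each distinct word.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by word
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = List.of("hello hadoop", "hello maven");
        // Print each word and its count, tab-separated.
        countWords(input).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

Running main prints one line per distinct word with its count, which is the same word-and-count shape the job writes to the output path.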

The code is currently based on the official Hadoop tutorial and a Tikal tutorial for Maven + Hadoop.

Unit tests