hadoop-tasks

Hadoop tasks repository for Parallel and Distributed Computing course at MIPT 2015

Primary language: Java

Contains code for the following tasks:

  • Word Count
  • Inverted Index
  • Matrix Multiplication

Speed-up achieved for Matrix Multiplication

On a 4-node Hadoop cluster, Matrix Multiplication takes 1.5 min to multiply a 500x1000 matrix by a 1000x2000 matrix, while a sequential version of the same program, written in Python, takes about 5 min. That is a speed-up of roughly 3.3x.
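For readers unfamiliar with how a matrix product is split into map and reduce steps, here is a minimal sketch of one common one-pass formulation, simulated in memory with plain Java collections. It is not the code from this repository and does not use the Hadoop API. The class name `MatrixMultiplySketch`, the `Tagged` helper, and the string keys of the form `"i,k"` are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: an in-memory imitation of a one-pass MapReduce
// matrix multiplication C = A * B. Not the repository's implementation.
public class MatrixMultiplySketch {

    // One emitted value: source matrix ('A' or 'B'), shared index j, entry value.
    static final class Tagged {
        final char matrix;
        final int j;
        final double value;

        Tagged(char matrix, int j, double value) {
            this.matrix = matrix;
            this.j = j;
            this.value = value;
        }
    }

    // "Map" step: send every entry to each output cell (i, k) that needs it.
    // A[i][j] is needed by all cells in row i; B[j][k] by all cells in column k.
    static Map<String, List<Tagged>> mapPhase(double[][] a, double[][] b) {
        int n = a.length, m = b.length, p = b[0].length;
        Map<String, List<Tagged>> groups = new HashMap<>();
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < p; k++) {
                List<Tagged> cell = groups.computeIfAbsent(i + "," + k, key -> new ArrayList<>());
                for (int j = 0; j < m; j++) {
                    cell.add(new Tagged('A', j, a[i][j]));
                    cell.add(new Tagged('B', j, b[j][k]));
                }
            }
        }
        return groups;
    }

    // "Reduce" step for one output cell: pair A and B values on j, sum the products.
    static double reduceCell(List<Tagged> values, int m) {
        double[] fromA = new double[m];
        double[] fromB = new double[m];
        for (Tagged t : values) {
            if (t.matrix == 'A') {
                fromA[t.j] = t.value;
            } else {
                fromB[t.j] = t.value;
            }
        }
        double sum = 0.0;
        for (int j = 0; j < m; j++) {
            sum += fromA[j] * fromB[j];
        }
        return sum;
    }

    // Runs both steps and assembles the result matrix.
    static double[][] multiply(double[][] a, double[][] b) {
        if (a.length == 0 || b.length == 0 || a[0].length != b.length) {
            throw new IllegalArgumentException("Incompatible matrix dimensions");
        }
        int n = a.length, m = b.length, p = b[0].length;
        double[][] c = new double[n][p];
        for (Map.Entry<String, List<Tagged>> e : mapPhase(a, b).entrySet()) {
            String[] key = e.getKey().split(",");
            c[Integer.parseInt(key[0])][Integer.parseInt(key[1])] = reduceCell(e.getValue(), m);
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3}, {4, 5, 6}};
        double[][] b = {{7, 8}, {9, 10}, {11, 12}};
        // prints [[58.0, 64.0], [139.0, 154.0]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

In this formulation each output cell is computed independently in the reduce step, which is what lets the work be spread across cluster nodes. The price is that every input entry is replicated many times during the map step, so on a real cluster the shuffle volume is often the limiting factor.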