GMM-on-Hadoop: A Java repository from mayyuen318

This Eclipse project contains Java implementations of the following algorithms:

The folder matlab/ contains the scripts and functions that generate the example multi-dimensional data.

The root directory also contains shell scripts for running the parallel version of EM, global mean, and word count on a Hadoop cluster.
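As a rough illustration of how a global mean can be split across workers, the sketch below simulates the map and reduce steps in plain Java without any Hadoop dependency. It is not the code in this repository: the class GlobalMeanSketch and the methods mapSplit and reduce are hypothetical names, and the real jobs may organise the computation differently. Each "map" call turns one data split into per-dimension sums plus a row count, and the "reduce" call adds the partial results and divides once at the end.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: map-reduce style global mean, simulated in one JVM.
public class GlobalMeanSketch {
    /** Partial result from one data split: per-dimension sums plus a row count. */
    static final class Partial {
        final double[] sum;
        final long count;
        Partial(double[] sum, long count) { this.sum = sum; this.count = count; }
    }

    /** "Map" step: collapse one split into its per-dimension sums and row count. */
    static Partial mapSplit(double[][] split, int dim) {
        double[] sum = new double[dim];
        for (double[] row : split) {
            for (int d = 0; d < dim; d++) {
                sum[d] += row[d];
            }
        }
        return new Partial(sum, split.length);
    }

    /** "Reduce" step: add all partial sums and counts, then divide once. */
    static double[] reduce(List<Partial> partials, int dim) {
        double[] total = new double[dim];
        long n = 0;
        for (Partial p : partials) {
            for (int d = 0; d < dim; d++) {
                total[d] += p.sum[d];
            }
            n += p.count;
        }
        if (n == 0) {
            throw new IllegalArgumentException("no data");
        }
        for (int d = 0; d < dim; d++) {
            total[d] /= n;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two made-up splits of 2-dimensional data.
        double[][] splitA = {{1.0, 2.0}, {3.0, 4.0}};
        double[][] splitB = {{5.0, 6.0}};
        List<Partial> partials = new ArrayList<>();
        partials.add(mapSplit(splitA, 2));
        partials.add(mapSplit(splitB, 2));
        double[] mean = reduce(partials, 2);
        System.out.println(mean[0] + " " + mean[1]);
    }
}
```

Passing sums and counts rather than per-split means keeps the result correct when the splits have different sizes.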

To run the sequential versions of EM and global mean, read the headers of MapReduce/sequential/gmm/GMM.java and MapReduce/sequential/gmm/OneMean.java.
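For readers unfamiliar with EM for Gaussian mixtures, the following is a minimal sequential sketch of one EM iteration on 1-dimensional data. It is an assumption-laden illustration and not the contents of GMM.java: the class EmStepSketch, its method names, the toy data, and the variance floor of 1e-6 are all hypothetical choices made for this example.

```java
// Hypothetical sketch: EM for a 1-D Gaussian mixture, sequential version.
public class EmStepSketch {
    /** Gaussian density N(x | mean, var) for a scalar x. */
    static double gaussian(double x, double mean, double var) {
        double diff = x - mean;
        return Math.exp(-0.5 * diff * diff / var) / Math.sqrt(2.0 * Math.PI * var);
    }

    /**
     * Runs one EM iteration, updating w, mu and var in place.
     * Returns the log-likelihood of the data under the parameters
     * as they were before the update.
     */
    static double emStep(double[] x, double[] w, double[] mu, double[] var) {
        int k = w.length;
        double[] n0 = new double[k]; // sum of responsibilities
        double[] n1 = new double[k]; // sum of responsibility * x
        double[] n2 = new double[k]; // sum of responsibility * x^2
        double[] p = new double[k];
        double logLik = 0.0;

        // E-step: accumulate sufficient statistics over all points.
        for (double xi : x) {
            double total = 0.0;
            for (int j = 0; j < k; j++) {
                p[j] = w[j] * gaussian(xi, mu[j], var[j]);
                total += p[j];
            }
            logLik += Math.log(total);
            for (int j = 0; j < k; j++) {
                double r = p[j] / total;
                n0[j] += r;
                n1[j] += r * xi;
                n2[j] += r * xi * xi;
            }
        }

        // M-step: re-estimate weights, means and variances.
        for (int j = 0; j < k; j++) {
            w[j] = n0[j] / x.length;
            mu[j] = n1[j] / n0[j];
            // Floor the variance (assumed value) to avoid collapse to zero.
            var[j] = Math.max(n2[j] / n0[j] - mu[j] * mu[j], 1e-6);
        }
        return logLik;
    }

    public static void main(String[] args) {
        // Made-up data with two well-separated clusters.
        double[] x = {-2.1, -1.9, -2.0, 2.0, 2.1, 1.9};
        double[] w = {0.5, 0.5};
        double[] mu = {-1.0, 1.0};
        double[] var = {1.0, 1.0};
        for (int iter = 0; iter < 20; iter++) {
            double logLik = emStep(x, w, mu, var);
            System.out.println("iter " + iter + " logLik " + logLik);
        }
        System.out.println("means: " + mu[0] + " " + mu[1]);
    }
}
```

The E-step only needs the three accumulated sums per component, which is the kind of quantity that can also be computed per data split and added together, as in a parallel version.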

M.W. Mak March 2015

Steps for downloading the repo

Steps for creating a new repo on GitHub

cd ~/so/java/hadoop/Workspace/MapReduce

git init

git config --global user.name "enmwmak"

git config --global user.email "enmwmak@polyu.edu.hk"

git add .

git commit -m "first commit"

git push -u origin master
