Normalize the enron data
- Run "make" with "preprocess.cpp" and "topnodes.cpp" in the same directory.
- Run "preproc > enron" with "Email-Enron.txt" in the same directory.
Run FindTriangle on normalized data
- Run "javac -classpath ./hadoop-core-1.2.1.jar -d node_classes ~/Downloads/hadoop-1.2.1/FindTriangle.java"
- Run "jar -cvf ./node.jar -C node_classes/ ."
- Run "bin/hadoop fs -put enron input"
- Run "bin/hadoop jar node.jar org.myorg.FindTriangle input temp out"
- Run "bin/hadoop fs -get out enronout"
Find the 10 top most frequently appearing nodes
- Run "topnodes" with "enronout" in the same directory.
- The 10 top nodes with how many times they appear are written to standard output
Development Environment VirtualBox running Ubuntu 12.04