Spark Projects for the Berkeley Data Science Course
-
Wordcount in Spark - A program that counts the words in all of Shakespeare's plays
-
Apache Log File Analysis in Spark - Using Spark to explore a NASA Apache web server log
-
Entity Resolution - Entity resolution using TF-IDF approaches in Spark
-
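
As a rough illustration of the TF-IDF idea behind this project, the sketch below scores how similar two tokenized records are using TF-IDF weights and cosine similarity. It is plain Python rather than Spark, and it is not the project's code: the function names, the sample records, and the choice of IDF as the simple ratio N / document frequency (many variants take a logarithm) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf(tokens):
    """Term frequency: each token's count divided by the record length."""
    counts = Counter(tokens)
    total = len(tokens)
    return {t: c / total for t, c in counts.items()}


def idf(corpus):
    """Inverse document frequency as N / df (assumed variant, no logarithm)."""
    n = len(corpus)
    doc_freq = Counter(t for tokens in corpus for t in set(tokens))
    return {t: n / df for t, df in doc_freq.items()}


def tfidf(tokens, idf_weights):
    """Weight each token's term frequency by its IDF."""
    return {t: w * idf_weights[t] for t, w in tf(tokens).items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical product records, already tokenized.
records = [
    ["canon", "powershot", "camera"],
    ["canon", "power", "shot", "digital", "camera"],
    ["sony", "headphones"],
]
weights = idf(records)
vectors = [tfidf(r, weights) for r in records]

# Records 0 and 1 share tokens, so they score above zero;
# records 0 and 2 share none, so they score exactly zero.
similar = cosine(vectors[0], vectors[1])
unrelated = cosine(vectors[0], vectors[2])
```
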
Movie Recommendation using ALS - Predicting movie ratings using Spark
-
Linear Regression - Predicting song year using linear regression in Spark
-
Logistic Regression - Predicting click-through rates using Spark, with one-hot encoding and feature hashing explained
-
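
To make the two encodings named above concrete, here is a minimal plain-Python sketch, not the project's Spark code. One-hot encoding looks each (feature id, value) pair up in a dictionary built from the training data, while feature hashing maps the pair to a bucket with a hash function and so needs no dictionary. The function names, the sample data, and the use of MD5 as the hash are assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict


def one_hot_encode(raw_features, ohe_dict):
    """Return the sorted indices of the active one-hot columns.

    Pairs missing from ohe_dict (unseen during training) are skipped.
    """
    return sorted(ohe_dict[f] for f in raw_features if f in ohe_dict)


def hash_features(raw_features, num_buckets):
    """Map (feature_id, value) pairs to a sparse {bucket: count} dict.

    MD5 is used only because it is deterministic across runs; Python's
    built-in hash() is randomized for strings.
    """
    counts = defaultdict(float)
    for feature_id, value in raw_features:
        key = f"{feature_id}:{value}".encode("utf-8")
        bucket = int(hashlib.md5(key).hexdigest(), 16) % num_buckets
        counts[bucket] += 1.0  # colliding features add up in one bucket
    return dict(counts)


# Hypothetical sample: feature 0 is the animal, feature 1 is the color.
sample = [(0, "mouse"), (1, "black")]
ohe_dict = {(0, "mouse"): 0, (0, "cat"): 1, (1, "black"): 2, (1, "tabby"): 3}

ohe_indices = one_hot_encode(sample, ohe_dict)
hashed = hash_features(sample, num_buckets=4)
```
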
PCA - Running PCA on neuroscience data