This exercise consists of 4 parts:
Part 1: Creating a base RDD and pair RDDs
Part 2: Counting with pair RDDs
Part 3: Finding unique words and a mean value
Part 4: Applying word count to a file
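The counting pattern named in Parts 1 to 3 can be sketched without a Spark cluster. What follows is a minimal plain-Python illustration, not the exercise's own solution: the sample word list and the helper names `make_pairs` and `reduce_by_key` are hypothetical stand-ins for the `map` and `reduceByKey` steps one would apply to a pair RDD.

```python
from collections import defaultdict

# Hypothetical sample data standing in for a base RDD of words.
words = ["cat", "elephant", "rat", "rat", "cat"]

def make_pairs(items):
    """Mimic rdd.map(lambda w: (w, 1)): emit one (word, 1) pair per word."""
    return [(w, 1) for w in items]

def reduce_by_key(pairs):
    """Mimic reduceByKey(add): sum the values of all pairs sharing a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Count occurrences per word, then derive the number of unique words
# and the mean number of occurrences per unique word.
word_counts = reduce_by_key(make_pairs(words))
unique_words = len(word_counts)
mean_count = sum(word_counts.values()) / unique_words
```

Keeping the pairing and the reduction as separate steps mirrors how the work would be split across partitions in Spark, where each key's values are combined independently.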
This exercise consists of 4 parts:
Part 1: Apache Web Server Log file format
Part 2: Sample Analyses on the Web Server Log File
Part 3: Analyzing the Web Server Log File
Part 4: Exploring 404 Response Codes
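As a rough illustration of the log-format and 404 parts, the sketch below parses lines in the Apache Common Log Format (host, identity, user, timestamp, request, status, size) and counts the endpoints that returned 404. The regular expression, the function name `parse_log_line`, and the sample lines are assumptions made for this example, not taken from the exercise.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident authuser [timestamp] "method endpoint protocol" status size
LOG_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+)\s*(\S*)" (\d{3}) (\S+)$'
)

def parse_log_line(line):
    """Return a dict of fields for a well-formed line, or None otherwise."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    host, _ident, _user, timestamp, method, endpoint, protocol, status, size = match.groups()
    return {
        "host": host,
        "timestamp": timestamp,
        "method": method,
        "endpoint": endpoint,
        "protocol": protocol,
        "status": int(status),
        # The size field is "-" when no content was returned.
        "size": int(size) if size.isdigit() else 0,
    }

# Hypothetical sample lines, invented for illustration only.
sample_lines = [
    '127.0.0.1 - - [01/Aug/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 1839',
    '127.0.0.1 - - [01/Aug/1995:00:00:07 -0400] "GET /missing.html HTTP/1.0" 404 -',
    '192.168.0.5 - - [01/Aug/1995:00:00:09 -0400] "GET /missing.html HTTP/1.0" 404 -',
]

parsed = [p for p in map(parse_log_line, sample_lines) if p is not None]
not_found = Counter(p["endpoint"] for p in parsed if p["status"] == 404)
```

Returning `None` for lines that do not match lets the caller count or inspect malformed entries separately instead of failing on the first bad line.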
This exercise consists of 5 parts and quiz questions:
Part 1: Entity Resolution (ER) as Text Similarity - Bags of Words
Part 2: ER as Text Similarity - Weighted Bag-of-Words using Term-Frequency/Inverse-Document-Frequency
Part 3: ER as Text Similarity - Cosine Similarity
Part 4: Scalable ER
Part 5: Analysis (this is the part where you will click through and view plots of your work from Part 4)
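The text-similarity idea behind Parts 1 to 3 can be sketched in plain Python. This is an illustration under stated assumptions, not the exercise's definitions: term frequency is taken as count divided by document length, inverse document frequency as log(N / document frequency), and the function names and the three-document corpus are hypothetical.

```python
import math
from collections import Counter

def tf(tokens):
    """Term frequency: each token's count divided by the document length."""
    counts = Counter(tokens)
    total = len(tokens)
    return {t: c / total for t, c in counts.items()}

def idfs(corpus):
    """Assumed IDF: log(number of documents / documents containing the token)."""
    n_docs = len(corpus)
    doc_freq = Counter(t for tokens in corpus for t in set(tokens))
    return {t: math.log(n_docs / df) for t, df in doc_freq.items()}

def tfidf(tokens, idf_weights):
    """Weight each token's term frequency by its IDF (0 for unseen tokens)."""
    return {t: w * idf_weights.get(t, 0.0) for t, w in tf(tokens).items()}

def cosine_similarity(a, b):
    """Dot product of two sparse vectors divided by the product of their norms."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical tokenized records, invented for illustration only.
corpus = [
    ["apple", "ipod", "black"],
    ["apple", "ipod", "white"],
    ["sony", "camera", "black"],
]
weights = idfs(corpus)
vectors = [tfidf(doc, weights) for doc in corpus]
sim_01 = cosine_similarity(vectors[0], vectors[1])  # records sharing two tokens
sim_02 = cosine_similarity(vectors[0], vectors[2])  # records sharing one token
```

Representing each record as a sparse dictionary keeps the dot product proportional to the number of tokens in the smaller record, which is the property a scalable version would exploit.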
This exercise consists of 3 parts and quiz questions:
Part 1: Basic Recommendations
Part 2: Collaborative Filtering
Part 3: Predictions for Yourself (this is the part where you will enter your own ratings and see what movies are recommended for you)