Build Your Data Engineering Skills. Learn how to tame the big data beast with the most popular tools, assisted by top-notch practitioners.
An introduction to HDFS, MapReduce, and Spark and their system internals. The material helps you understand the MapReduce framework and includes exercises on processing text.
- Week 1 Demo Assignment
- Hadoop Streaming assignment 0: Word Count
- Hadoop Streaming assignment 1: Words Rating
- Hadoop Streaming assignment 2: Stop Words
- Spark assignment 1: Pairs
- Reconstructing the path
- Real-World Applications: TF-IDF
Honors Assignments
- Hadoop Streaming assignment 3: Name Count
- Hadoop Streaming assignment 4: Word Groups
- Spark assignment 2: Collocations
- Week 2 Demo Assignment
- Hive Assignment 1. DDL: Create Tables
- Hive Assignment 2. DML: Find Most Popular Tags
- Week 4 Demo Assignment
- Counting the number of mutual friends
- Week 5 Demo Assignment
- Graph-based Music Recommender. Task 1
- Graph-based Music Recommender. Task 2
- Graph-based Music Recommender. Task 3
- Graph-based Music Recommender. Task 4
- Week 6 Demo Assignment
- Breadth-first search in Spark SQL
Honors Assignments