Yandex_big_data

A course on working with big data

Big Data for Data Engineers Specialization

Coursera Specialization

Build your data engineering skills. Learn how to tame the big data beast with the most popular tools, assisted by top-notch practitioners.

Courses

Big Data Essentials: HDFS, MapReduce and Spark

An introduction to HDFS, MapReduce, and Spark and their system internals. The exercises help build an understanding of the MapReduce framework through text-processing tasks.

  1. Week 1 Demo Assignment
  2. Hadoop Streaming assignment 0: Word Count
  3. Hadoop Streaming assignment 1: Words Rating
  4. Hadoop Streaming assignment 2: Stop Words
  5. Spark assignment 1: Pairs
  6. Reconstructing the path
  7. Real-World Applications: TF-IDF
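For orientation, the word-count pattern behind the Hadoop Streaming assignments can be sketched in plain Python. This is a minimal local simulation, not the course's reference solution: the names `mapper`, `reducer`, and `run_local` are hypothetical, and the in-memory sort only stands in for the shuffle step that Hadoop performs between the map and reduce phases.

```python
import re
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in re.findall(r"\w+", line.lower()):
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts for each key; the input must already be sorted by key."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_local(lines):
    """Local stand-in for map -> sort by key -> reduce (assumption: data fits in memory)."""
    mapped = sorted(mapper(lines), key=lambda record: record.split("\t", 1)[0])
    return list(reducer(mapped))


if __name__ == "__main__":
    for record in run_local(["the quick the", "Quick fox"]):
        print(record)
```

On a real cluster the mapper and reducer would be separate scripts that read from standard input and write to standard output; they are kept as generator functions here so the whole flow can be run and checked in a notebook cell.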

Honors Assignments

  1. Hadoop Streaming assignment 3: Name Count
  2. Hadoop Streaming assignment 4: Word Groups
  3. Spark assignment 2: Collocations

Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames

  1. Week 2 Demo Assignment
  2. Hive Assignment 1. DDL: Create Tables
  3. Hive Assignment 2. DML: Find Most Popular Tags
  4. Week 4 Demo Assignment
  5. Counting the number of mutual friends
  6. Week 5 Demo Assignment
  7. Graph based Music Recommender. Task 1
  8. Graph based Music Recommender. Task 2
  9. Graph based Music Recommender. Task 3
  10. Graph based Music Recommender. Task 4
  11. Week 6 Demo Assignment
  12. Breadth-first search in Spark SQL

Honors Assignments

  1. Hive Assignment 3. DML: Calculate Amount of Posts per User Age
  2. Graph based Music Recommender. Task 5
  3. Graph based Music Recommender. Task 6