A set of practical activities on big data fundamentals (Hadoop, HDFS, Spark, ...).

Primary language: Java. License: MIT.

bigdata_fundamentals_S4

This repository collects my practical activities (awesome code) from my class at ENSET-M, S4 II-BDCC 2.

Topics

  1. Interacting with HDFS using command line. Source Code, Report.

  2. Interacting with HDFS using Java. Source Code, Report.

  3. MapReduce job to find the total sales by city. Source Code, Report.

  4. MapReduce job to find the total sales by city on a specific date. Source Code, Report.

  5. MapReduce job to find the min/max salary by department. Source Code, Report.

  6. MapReduce job to find the number of employees by department. Source Code, Report.

  7. MapReduce job to find the min/max temperature by month (1916). Source Code, Report.

  8. MapReduce job to find the min/max temperature by year (1908-1916, multiple input files). [Not Ready]

  9. K-means implementation with Java. Source Code

  10. K-means implementation using MapReduce (points clustering). Source Code, TP, Report

  11. K-means implementation using MapReduce (image processing). Source Code, TP [Report]
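Several of the topics above share the same map / shuffle / reduce shape. As a rough illustration of topic 3 (total sales by city), here is a minimal plain-Java sketch that runs without Hadoop. It is not the repository's code: the class name `SalesByCitySketch`, the method names, and the input layout (`date city product price`, whitespace-separated) are all assumptions made for the example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free sketch of the "total sales by city" pattern.
public class SalesByCitySketch {

    // Map phase: parse one line and emit a (city, price) pair.
    // Assumed line layout: "date city product price".
    static Map.Entry<String, Double> map(String line) {
        String[] fields = line.trim().split("\\s+");
        return new SimpleEntry<>(fields[1], Double.parseDouble(fields[3]));
    }

    // Shuffle phase: group every emitted value under its key.
    static Map<String, List<Double>> shuffle(List<Map.Entry<String, Double>> pairs) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (Map.Entry<String, Double> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the prices collected for one city.
    static double reduce(List<Double> prices) {
        double total = 0.0;
        for (double price : prices) {
            total += price;
        }
        return total;
    }

    // Chains the three phases over an in-memory list of lines.
    static Map<String, Double> totalSalesByCity(List<String> lines) {
        List<Map.Entry<String, Double>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.add(map(line));
        }
        Map<String, Double> totals = new TreeMap<>();
        for (Map.Entry<String, List<Double>> entry : shuffle(pairs).entrySet()) {
            totals.put(entry.getKey(), reduce(entry.getValue()));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2021-01-01 Casablanca laptop 100.0",
            "2021-01-01 Rabat phone 50.0",
            "2021-01-02 Casablanca mouse 25.5");
        // Prints {Casablanca=125.5, Rabat=50.0}
        System.out.println(totalSalesByCity(lines));
    }
}
```

In a real Hadoop job the map and reduce steps would live in Mapper and Reducer classes and the framework would perform the shuffle; the variants in topics 4 to 7 mostly change the key (for example date plus city, or department) and the reduce operation (sum, count, min/max).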

Spark

  1. Three Spark tasks (WordCount, total sales by city, and total sales by date and city). Source Code, TP

  2. Analyzing the meteorological data provided by NCEI (National Centers for Environmental Information) using Spark. Source Code, TP

  3. Getting started with Spark SQL TP (DataFrames & Datasets using Java). Source Code, DETAILS
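The WordCount task in the first Spark item follows a split, pair, and sum-per-key pipeline. The sketch below mirrors that shape with plain Java streams so it runs without a Spark dependency. It is only an analogue of a flatMap followed by a per-key reduction, not Spark API and not the repository's code; the class name `WordCountSketch` and the method `countWords` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical plain-Java analogue of a word count pipeline (no Spark involved).
public class WordCountSketch {

    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
            // Split each line into words (the flatMap step).
            .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
            // Drop empty tokens produced by blank lines.
            .filter(word -> !word.isEmpty())
            // Group by word and count occurrences (the per-key reduction).
            .collect(Collectors.groupingBy(
                word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello spark", "hello hdfs");
        // Prints {hdfs=1, hello=2, spark=1}
        System.out.println(countWords(lines));
    }
}
```

With Spark the same steps would run on a distributed dataset rather than an in-memory list, which is the main difference this sketch leaves out.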