big-data-mapreduce-course

Big Data, MapReduce, Spark, PySpark, Java @ Santa Clara University, SPRING 2017


Course Information

Exam Dates

  • Midterm Exam: October 2017 (possibly late October), 5:45pm to 7:00pm PST
  • Final Exam: December 4-7, 2017, 5:45pm to 7:45pm PST

Course Description

This class covers the following topics:

  • Concepts of Big Data
  • Distributed File Systems
  • Distributed Computing
  • Distributed and Parallel Algorithms
  • MapReduce Paradigm
  • MapReduce Algorithms
  • Scale-out Architectures (using Hadoop, Spark, PySpark)
  • Apache Spark: http://spark.apache.org/
  • Spark, PySpark, Hadoop, and Java as the tools for teaching MapReduce and distributed computing
  • How to run SQL queries over NoSQL data
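
To give a flavor of the MapReduce paradigm listed above, here is a minimal single-process sketch of word count in plain Python. It is not taken from the course materials and does not use the Hadoop or Spark APIs; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. A real framework would run the same three phases across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper (hypothetical name): emit a (word, 1) pair for each word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence in a single process."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "Big Data MapReduce"])
print(counts)  # {'big': 2, 'data': 2, 'mapreduce': 1}
```

The mapper and reducer are kept free of shared state, which is the property that lets a framework such as Hadoop or Spark distribute them across a cluster.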

My latest book:

Data Algorithms: Recipes for Scaling up with Hadoop and Spark
