Course Information: Spring 2018

  • Graduate Business, Leavey School of Business
  • Course MSIS 2641: Big Data Modeling & Analytics
  • Big-Data-MapReduce Course @ Santa Clara University
  • Class Meeting dates: 04/03/2018 - 06/08/2018
  • Class hours: Class Number 68044: TTh 5:45PM - 7:00PM PST
  • Class hours: Class Number 68041: TTh 7:35PM - 8:50PM PST
  • Class room: Lucas Hall 310
  • Office: 321 T, Lucas Hall

Required Books and Papers


Midterm Exam

  • Midterm Exam (Class Number 68044): TBD, 5:45PM - 7:00PM PST
  • Midterm Exam (Class Number 68041): TBD, 7:35PM - 8:50PM PST

Final Exam Class Number 68044

  • Class hours: TTh 5:45PM - 7:00PM PST
  • Final Exam Date: Tuesday, June 12, 2018
  • Final Exam Time: 5:45PM - 7:00PM PST

Final Exam Class Number 68041

  • Class hours: TTh 7:35PM - 8:50PM PST
  • Final Exam Date: Thursday, June 14, 2018
  • Final Exam Time: 5:45PM - 7:00PM PST

Course Description

This class covers the following concepts:

  • Concepts of Big Data
  • Distributed File Systems
  • Distributed Computing
  • Distributed and Parallel Algorithms
  • MapReduce Paradigm
  • MapReduce Algorithms
  • Scale-out Architectures (using Hadoop, Spark, PySpark)
  • Apache Spark: http://spark.apache.org/
  • Spark, PySpark, Hadoop, and Java are used to teach MapReduce and distributed computing
  • How to use SQL with NoSQL data

My latest book:

Data Algorithms: Recipes for Scaling up with Hadoop and Spark
