/e63-coursework

CSCI E-63: Big Data Analytics course taught at Harvard - Fall semester 2017

CSCI E-63 Big Data Analytics

Professional Graduate Data Science Coursework - Fall semester 2017

Professor: Zoran B. Djordjević, PhD, Senior Enterprise Architect, NTT Data, Inc.

Description:

The emphasis of this course is on mastering two important big data technologies: Spark 2 and TensorFlow. The focus is on Spark Core, Spark ML (machine learning), and Spark Streaming, which allows analysis of data in flight, that is, in near real time. Furthermore, the so-called NoSQL storage solutions, exemplified by Cassandra, are examined. An additional focus lies on memory-resident databases, graph databases (Spark GraphX and Neo4j), and scalable messaging systems like Kafka and Amazon Kinesis.

File Layout

The hw directory structure is as follows:

DIRECTORY     DESCRIPTION
.             Files such as README and .gitignore
./docs/       Different files and presentations
./data/       Folder with all the necessary data
./scripts/    Folder with all the code

Website

You can access all the coursework etc. here.