- Professor: Daniel Acuna https://acuna.io
- Scribes: Lizhen Liang and Yimin Xiao
- Introduction to Data Science
  - Linear algebra, calculus, statistics; Python, Jupyter notebook
- Python Programming
  - NumPy, Pandas, Matplotlib
- Introduction to Hadoop, MapReduce, and Apache Spark
- Introduction to Spark DataFrames and Spark ML
- A Statistical Perspective on Machine Learning
  - Introduction to probability; maximum likelihood estimation; mean squared error estimation; gradient descent
- Assessing Model Accuracy
  - Confusion matrix; bias–variance tradeoff; model selection: training, validation, and testing
- Case 1: Sentiment Analysis of Twitter
  - Supervised learning, logistic regression, regularized logistic regression, elastic net regularization, model interpretation
- Case 2: A Recommendation System for Courses
  - Unsupervised learning, nearest neighbors, dimensionality reduction (Principal Component Analysis, PCA), clustering (k-means)
- Case 3: Predicting Credit Scores with Bagging and Boosting
  - "Wisdom of the crowd", bagging, random forests, gradient boosting, feature importance
- Case 4: Object Recognition with Deep Learning
  - Neural networks, multilayer perceptron (MLP), backpropagation for MLP; computation graph, stochastic and mini-batch gradient descent, loss function, model definition, convolutional and recurrent networks, other topics