DAT_SF_14

Course materials for General Assembly's Data Science course in San Francisco (4/28/15 - 7/9/15).

Instructor:

Experts-in-Residence:

  • Alex Chao (Office Hours: 2-6 pm Sundays)

  • David Feldman (Office Hours: 4-6:30 pm Tuesdays and Thursdays)


Course Schedule (Tentative)

| Week | Tuesday | Thursday |
|------|---------|----------|
| 1 | 4/28: Introduction to Data Exploration | 4/30: Introduction to Machine Learning |
| 2 | 5/05: Data Format, Access & Transformation | 5/07: K-Nearest Neighbors Classification; Final Project Kickoff |
| 3 | 5/12: Naive Bayes Classification; HW1 Due | 5/14: Regression and Regularization |
| 4 | 5/19: Logistic Regression; HW2 Due | 5/21: K-Means Clustering; Project Milestone (PM1): Elevator Pitch |
| 5 | 5/26: Clustering & Decision Trees | 5/28: Tree-based Classifiers |
| 6 | 6/02: Ensemble Techniques; Project Milestone (PM2): Data Ready | 6/04: Support Vector Machines |
| 7 | 6/09: Dimensionality Reduction; HW3 Due | 6/11: Imbalanced Classes and Evaluation Metrics |
| 8 | 6/16: Recommendation Systems; Project Milestone (PM3): First Draft Due | 6/18: Natural Language Processing |
| 9 | 6/23: Final Project Work Session & Peer Feedback; Project Milestone (PM4): Peer Feedback Due | 6/25: Map-Reduce & Hadoop |
| 10 | 6/30: Distributed Data Processing (Spark) | 7/02: Network Analysis |
| 11 | 7/07: Project Presentations Day 1; Project Milestone (PM5): Presentation | 7/09: Project Presentations Day 2; Project Milestone (PM5): Presentation & Paper |

Homework Schedule

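| HW | Topics | Dataset | Assigned | Due |
|----|--------|---------|----------|-----|
| 1 | Data Exploration | titanic | 5/05 | 5/12 |
| 2 | KNN & Cross Validation | iris | 5/07 | 5/19 |
| PM1 | Elevator Pitch | project | - | 5/21 |
| PM2 | Data Ready | project | - | 6/02 |
| 3 | Decision Trees | bank | 6/02 | 6/09 |
| PM3 | First Draft Due | project | - | 6/16 (Before Class) |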

Resources

Working in the terminal

Statistical Learning Theory

Algorithms

Python