
DAT NY 25 Course Repository

Course materials for General Assembly's Data Science course in New York (8/23/15 - 11/12/15).

Course Producer: Daniel Demoray (email: ddemoray@generalassemb.ly)

Instructor: Amy Roberts

EiRs: Tom Hunter & Corey Maher

###Exit Ticket

Fill me out at the end of each class!

###Course Description

Foundational course in data science, including machine learning theory, case studies and real-world examples, introduction to various modeling techniques, and other tools to make predictions and decisions about data. Students will gain practical computational experience by running machine learning algorithms and learning how to choose the best and most representative data models to make predictions. Students will be using Python throughout this course.

###Completion Requirements

You can always reach out to Daniel by phone or email with questions about enrollment, payments, graduation requirements, or how to get to know other students.
General Assembly's part-time courses are pass/fail programs. To be considered a graduate, you must meet the following requirements:

  1. Missing no more than 2 class sessions over the duration of the course
  2. Completing 80% of assigned homework
  3. Completing the final project

###Course Schedule

(Advanced topics will be finalized after student goals are defined)

Week | Tuesday | Thursday
--- | --- | ---
1 | 8/25: Introduction to Data Science | 8/27: Introduction to Python for Data Science
2 | 9/1: Intro to Machine Learning with KNN <br>HW1 Due | 9/3: Regression & Regularization Part 1
3 | 9/8: Web APIs & Regression Part 2 | 9/10: Decision Trees for Classification & Regression
4 | 9/15: Clustering with K-means | 9/17: Random Forests & <br>Project Milestone: [Elevator Pitch]
5 | 9/22: No Class | 9/24: Logistic Regression <br>HW3 Due
6 | 9/29: ROC Curves, AUC, & Imbalanced Classes | 10/1: Database Technologies
7 | 10/6: Recommender Systems <br>HW4 Due | 10/8: Naive Bayes
8 | 10/13: Natural Language Processing <br>Project Milestone: [First Draft Due] |
9 | 10/20: Dimensionality Reduction | 10/22: Ensemble Methods
10 | 10/27: Final Project Work Session | 10/29: Tom Anderson - NLP Case Studies from Odin Text / Time Series Intro
11 | 11/2: Heat Seek Data Viz and Algos for Good | 11/5: Time Series
12 | 11/10: Project Presentations Day 1 <br>Project Milestone: Presentation | 11/12: Project Presentations Day 2 <br>Project Milestone: Presentation & Paper

syllabus last updated: 10/27/2015

###Homework Schedule

Please submit completed homework assignments to the appropriate Google Drive folder.

HW | Topics | Dataset | Assigned | Due | Feedback
--- | --- | --- | --- | --- | ---
1 | Data Exploration | titanic | 8/27 | 9/1 | 9/3
2 | KNN & Cross Validation | iris | 9/3 | Optional | n/a
FP1 | Elevator Pitch | N/A | 9/8 | 9/17 | 9/24
3 | Decision Trees | bank | 9/15 | 9/24 | 10/1
4 | Logistic Regression, ROC/AUC, & Imbalanced Classes | spam | 9/29 | 10/6 | 10/13
FP2 | [First Draft] of Final Project | yours | 9/29 | 10/13 | 10/20
FP3 | [Peer Feedback] on Final Project First Draft | yours | 10/13 | 10/20 | n/a
FP4 | [Final Project] | yours | 9/8 | 11/10 | 11/13


###Office Hours

Instructor | Times Available | Method
--- | --- | ---
Amy | by appointment | TBD
Corey | Saturdays 10-12 | in person on 3rd Floor at GA or by Slack
Tom | Sundays 2-4 | in person on 3rd Floor at GA or by Slack

Please use email or Slack to schedule office hours. If you email, put [office hours] in the subject line so that we can find your message more easily and reply more quickly.


You've all been invited to use Slack for chat during class and throughout the day. Please consider this the primary way to contact other students. The TAs will be in Slack during class to handle questions. All instructors will be available on Slack during office hours (listed above).