DS 2.1 Machine Learning

Course Description

Students will learn the foundational concepts and techniques of machine learning and how to apply those techniques to data science. Principles of data science and machine learning will be examined and applied to problem solving. Students will master data science processes and their applications, including how to wrangle and use data to train classification or prediction models. To demonstrate mastery, students will apply these techniques to develop and train models on data sets using industry-standard modern software libraries and tools. Students will develop “sharp” data science questions, select a data set, and apply a variety of methods to explore those questions and find relevant answers.

Why should you know this?

Machine learning has shaped our world. It is the science of getting computers to learn and act as humans do, and to improve their learning over time autonomously, by feeding them data and information in the form of observations and real-world interactions.

Course Specifics

Weeks to Completion: 7
Total Seat Hours: 37.5 hours
Total Out-of-Class Hours: 75 hours
Total Hours: 112.5 hours
Units: 3 units
Delivery Method: Residential
Class Sessions: 14 classes, 7 labs

Prerequisites:

Learning Outcomes

By the end of the course, students will be able to:

  1. Identify a prediction problem and choose the appropriate regression model
  2. Identify a classification problem and choose the appropriate classifier
  3. Evaluate either a regression model or a classifier
  4. Cluster unlabeled datasets into groups
  5. Compare models and choose the best model for the task, while continuing to tune the model's hyperparameters
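
As a rough illustration of outcomes 1 and 3, the sketch below fits a one-feature linear regression by ordinary least squares and then evaluates it on held-out points with mean squared error and R². It uses only the Python standard library. The function names (fit_line, predict, mse, r2) and the toy data are made up for this example and are not part of the course materials; in practice a library would typically do this work.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(xs, slope, intercept):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]


def mse(y_true, y_pred):
    """Mean squared error: average squared difference (lower is better)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical toy data: y is roughly 2x + 1 with a little noise.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [3.1, 4.9, 7.2, 9.0, 10.8]
test_x = [6.0, 7.0]
test_y = [13.1, 14.9]

# Fit on the training points, then evaluate on points the model has not seen.
slope, intercept = fit_line(train_x, train_y)
test_pred = predict(test_x, slope, intercept)
test_mse = mse(test_y, test_pred)
test_r2 = r2(test_y, test_pred)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"test MSE={test_mse:.3f}, test R^2={test_r2:.3f}")
```

Evaluating on points held out from fitting is the design choice that matters here: a score computed on the training data alone tends to overstate how well a model will predict new observations.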

Schedule

Course Dates: Monday, March 30 – Wednesday, May 13, 2026 (7 weeks)

Class Times: 2:30pm to 5:15pm on Monday, Wednesday (13 class sessions)

Class  Date           Topics
1      Mon, March 30  Introduction to Machine Learning
2      Wed, April 1   Linear Regression
-      Mon, April 6   Support Vector Machine
3      Wed, April 8   Logistic Regression
4      Mon, April 13  Model Evaluation
5      Wed, April 15  Decision Tree
6      Mon, April 20  Principal Component Analysis
7      Wed, April 22  Clustering
8      Mon, April 27  Review Session
9      Wed, April 29  Naive Bayes
10     Mon, May 4     TFIDF and its Application
11     Wed, May 6     Ensemble Methods
12     Mon, May 11    Final Exam
13     Wed, May 13    Presentations

Assignment Schedule


Assignment                                                              Date Assigned  Due Date       Submission Form
Homework 1 - Linear Regression for Boston Housing Dataset               Wed, April 1   Wed, April 8   Submit Assignment
Homework 2 - SVM for Breast Cancer Dataset                              Wed, April 8   Wed, April 15  Submit Assignment
Homework 3 - PCA and K-Means Clustering on wholesale customers dataset  Wed, April 22  Wed, April 29  Submit Assignment

Class Assignments

  • Apply Linear Regression for Boston Housing Dataset
  • Apply SVM for Breast Cancer Dataset
  • Apply PCA and K-Means Clustering on wholesale customers dataset
  • Projects should be linked to a project page which has a description & requirements.

Tutorials

Projects

  • You will choose your own dataset to clean and investigate, then use it for prediction, classification, or clustering

All projects require a minimum of 10 commits, and those commits must be spread throughout the entirety of the course

  • Good Example: 40+ commits throughout the length of the course, with a healthy spread of commits each week (such as 3-5 per day).

  • Bad Example: 10 commits on one day during the course and no others. Students who do this will be at severe risk of not passing the class.

  • Unacceptable Example: 2 commits the day before a project is due. Students who do this should not expect to pass the class.

  • The Final Project Guideline for DS 2.1

  • The Rubric for Final Project

Why are we doing this?

We want to encourage best practices that you will see when working as a professional software engineer. Breaking up a project into a large number of commits helps engineers in the following ways:

  • It's much easier to retrace your steps if you break your project/product/code up into smaller pieces
  • It helps you comprehend the larger problem, and also helps with debugging (e.g. finding exactly when you pushed that piece of broken code)
  • It allows for more streamlined, iterative communication in your team, as it's much easier to hand off a small change to someone (updating a function) than a huge one (changed the architecture of the project)

Through this requirement, we hope to encourage you to think about projects with an iterative, modular mindset. Doing so will allow you to break projects down into smaller milestones that come together to make your fully-realized solution.

Final Exam

  • Passing the exam is a requirement for passing the class.

  • You will have 2 hours to complete this exam. It will be taken in class using paper and pencil, or in a format of the instructor's choosing

  • There are no retakes of the exam.

  • If you have a disability that needs an accommodation such as extended time or a different format, please take advantage of our accommodations program.

  • Study Guide

Evaluation

To pass this course you must meet the following requirements:

  • Participate in all in-class activities
  • Complete all homework assignments
  • Finish the course tutorial(s)
  • Pass the Final Exam with a score of 75% or higher, and pass the Quizzes
  • Complete the final project and present it
  • Actively participate in class and abide by the attendance policy
  • Attend all Programming Labs
  • Make up all classwork from all absences

Attendance

Just like any job, attendance at Make School is required and a key component of your success. Attendance means being onsite from 9:30 to 5:30 each day, attending all scheduled sessions (classes, huddles, etc.), and working in the study labs when not in a scheduled session. Working onsite allows you to learn with your peers, gives you access to support from TAs, instructors and others, and is vital to your learning.

Attendance requirements for scheduled sessions are:

  • No more than two no-call, no-show absences per term in any scheduled session.
  • No more than four excused absences per term in any scheduled session.

Failure to meet these requirements will result in a PIP (Participation Improvement Plan). Failure to improve after the PIP will result in not being invited back next term.

Make School Course Policies

Academic Honesty
Accommodations for Students
Attendance Policy
Diversity and Inclusion Policy
Grading System
Title IX Policy
Program Learning Outcomes