DS 2.3: Data Science in Production

Course Description

This course covers the tools and techniques commonly used for production machine learning in industry. Students learn how to provide web interfaces for training machine learning or deep learning models with Flask and Docker. They deploy models in the cloud through Amazon Web Services (AWS), gather and process data from the web, and display information in advanced web applications using Plotly and D3.js. They also use PySpark to make querying even the largest data stores manageable.

Why you should know this

In this course you will acquire a set of key skills connecting data science and data modeling with the back-end and front-end web tools that allow you to deploy models to the web. Mastering these skills will make you a more versatile Data Scientist or Data Engineer.

Course Specifics

Course Delivery: Online | 7 weeks | 13 sessions

Course Credits: 3 units | 37.5 Seat Hours | 75 Total Hours

Prerequisites:

Learning Outcomes

By the end of the course, students will be able to:

  1. Implement Advanced Visualizations using a Chart.js/D3.js Frontend and a Python Backend
  2. Implement a Machine Learning or Deep Learning model on a Web App using Flask and Flask-RESTPlus
  3. Understand containers and be familiar with the Docker ecosystem
  4. Dockerize a Flask Web App containing a Machine Learning or Deep Learning Model and deploy it on Heroku, and also on AWS (Amazon Web Services)
  5. Understand the PySpark ecosystem and work on Big Data using PySpark, H2O and Pandas

Schedule

Course Dates: Thursday, January 21 – Thursday, March 4, 2021 (7 weeks)

Class Times: Tuesday, Thursday at 2:45 pm–5:30 pm (13 class sessions)

Class Date Topics
- Tue, Jan 19 No Class
1 Thu, Jan 21 Introduction and accessing O'Reilly books through the MARINet library service
2 Tue, Jan 26 Full Stack Deep Learning setup for labs
3 Thu, Jan 28 Lab 1: Building an Interactive App with Chartist
4 Tue, Feb 2 Lab 2: Model Deployment Using Flask: Digit Recognizer Web App
5 Thu, Feb 4 Lab 3: Nasdaq Stock Prices Visualization with D3
6 Tue, Feb 9 Lab 4: Deploying an ML model to the Web on Heroku, Part 1
7 Thu, Feb 11 Lab 5: Deploying an ML model to the Web on Heroku, Part 2
8 Tue, Feb 16 Lab 6: Introduction to Docker, Part 1
9 Thu, Feb 18 Lab 7: Introduction to Docker, Part 2
10 Tue, Feb 23 Lab 8: Apache Spark, Part 1
11 Thu, Feb 25 Lab 9: Apache Spark, Part 2
12 Tue, Mar 2 Lab 10: Apache Spark, Part 3 OR SQL, Part 1
13 Thu, Mar 4 Lab 11: Apache Spark, Part 4 OR SQL, Part 2

Class Assignments are on Gradescope

  • HW1: Chartist Flask App
  • HW2: MNIST Digit Recognizer Flask App
  • HW3: Dockerize Machine Learning Model and Deploy on AWS
  • HW4: Apache Spark #1
  • HW5: Apache Spark #2 OR SQL
  • Extra Credit: Nasdaq Stock Prices Visualization App

If you have a disability that needs an accommodation, such as extended time or a different format, please take advantage of our accommodations program by filling out the intake form.

Evaluation

To pass this course you must meet the following requirements:

  • Complete 4 of the 5 homework assignments with a grade of 70% or higher

Information Resources

Additional resources you may need (online books, etc.) can be found in the library linked below:

Make School Course Policies