T81 577 Applied Data Science for Practitioners

Washington University in St. Louis

Instructor: Asim Banskota

Spring 2021, Wednesday, 6:00 PM - 9:00 PM, Cupples II, Room L015

Course Description

Organizations are rapidly transforming the way they ingest, integrate, store, and serve data and perform analytics. In this course, students will learn the steps involved in designing and implementing data science projects. Topics addressed include ingesting and parsing data from various sources, dealing with messy and missing data, transforming and engineering features, building and evaluating machine learning models, and visualizing results. Using Python-based tools such as NumPy, pandas, and scikit-learn, students will complete a practical data science project that addresses the entire design and implementation process. Students will also become familiar with best practices and current trends in data science, including code documentation, version control, reproducible research, pipeline automation, and cloud computing. Upon completion of the course, students will be equipped with data science knowledge and skills that can be applied from day one on the job.

Syllabus

Week (Date): Content
Week 1 (1/27/2021): Introductions
    Assignment 1.1: Install Anaconda and test Jupyter Notebook
    Assignment 1.2: AWS fundamentals
Week 2 (2/3/2021): Python Fundamentals
    Assignment 2: Programming practice assignment
Week 3 (2/10/2021): Coding Best Practices in Data Science
    Assignment 3: Version control, project structure, and code documentation
Week 4 (2/17/2021): Modeling Overview
Week 5 (2/24/2021): Accessing Data
    Assignment: Finalization of final project topic and data set (not graded)
Week 6 (3/3/2021): NumPy/pandas for Data Wrangling
    Assignment 4: Exercise with NumPy and pandas
Week 7 (3/10/2021): Exploratory Data Analysis (EDA)
    Assignment 5: Visualization and data summary
Week 8 (3/17/2021): Data Preprocessing
    Assignment 6: Data preprocessing
Week 9 (3/24/2021): Algorithms for Predictive Modelling, Part I
Week 10 (4/14/2021): Algorithms for Predictive Modelling, Part II
    Assignment 7: Model fitting and evaluation
Week 11 (4/21/2021): Best Practices in Machine Learning
    Assignment: Hyperparameter tuning
Week 12 (4/28/2021): Deploy a Machine Learning Model, Part I
Week 13 (5/5/2021): Deploy a Machine Learning Model, Part II
    Assignment: Final project due on April 30