This class covers the foundations of modern statistics and machine learning, complementing the data mining focus of IDS 572. You will get up to speed with the requisite background and the key theoretical underpinnings of modern analytics, viewed through the lens of statistical machine learning. Lectures will be complemented with hands-on exercises.
- Spring 2019 (has videos!)
- Spring 2018 (has videos!)
- Fall 2017 (has videos!)
- Semester: Spring 2022
- Lectures: Mondays 3.00 PM to 5.30 PM at DH 210
- Staff
- Instructor: Dr. Theja Tulabandhula
- Teaching Assistant: Parshan Pakiman
- Offline communication:
- Instructor Office Hours: Thursdays 2.30 to 3.30 PM at UH 2404
- TA Recitations and/or Office Hours: Mondays 6.30 to 7.30 PM on Zoom
- 01/10: lecture (online only)
- 01/24: lecture
- 01/31: lecture
- 02/06: assignment 1 due
- 02/07: lecture (online only)
- 02/14: lecture
- 02/20: assignment 2 due
- 02/21: lecture
- 02/27: project intermediate report+code due including project plan
- 02/28: lecture slot for separate team meetings with teaching staff on projects (online only)
- 03/06: assignment 3 due
- 03/07: lecture
- 03/14: lecture
- 03/28: lecture
- 03/28: assignment 4 due (New: same day as lecture)
- 04/04: lecture
- 04/11: lecture
- 04/11: assignment 5 due (New: same day as lecture)
- 04/17: project final report+code due
- 04/18: lecture slot for live student project presentations I (online only)
- 04/25: lecture slot for live student project presentations II (online only)
- Textbook I: Elements of Statistical Learning II.
- Textbook II: An Introduction to Statistical Learning with Applications in R.
- Slides
- Business Analytics in the Industry
- Refresher on Probability
- Refresher on Linear Algebra
- Calculus refresher pdf 1
- Calculus refresher pdf 2
- Probability refresher 1 source
- Probability refresher 2
- Linear Algebra refresher 1
- Linear Algebra refresher 2
- Machine Learning Mindmap
- Rules of ML by Martin Zinkevich
- XGBoost, LightGBM and Catboost implementations.
- Instance-based Learning
- k-Nearest Neighbors, Decision Trees
- Linear regression, Logistic Regression
- Generalized Linear Models
- Model Selection and Assessment
- Support Vector Machines and Duality
- Naïve Bayes and Linear Discriminant Analysis
- Hidden Markov Models
- Structured Prediction Models
- Statistical Learning Theory
- K-means clustering
- Mixture of Gaussians
- Principal Component Analysis
- Independent Component Analysis
- Canonical Correlation Analysis
You should form groups of (strictly) 4 students for the assignment and project components. Reach out to your classmates early. Here is a spreadsheet to facilitate this.
There will be five assignments, released on GitHub. These involve reimplementing statistical techniques and understanding their behavior on interesting datasets. Always cite any external sources used in your assignment solutions. Submissions are due before 11.59 PM on the due date. Late submissions incur an automatic 20% penalty per day, and no extensions are available. Use Blackboard for uploads. Because these are group assignments, a commensurate effort is expected, and each member's contributions need to be reported in the final submission.
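As an illustration of the late-penalty arithmetic, here is a minimal sketch. It rests on assumptions that the policy above does not state: that the penalty is 20% of the earned score per day late, applied linearly, with a floor at zero. The function name `late_adjusted_score` is hypothetical, and the teaching staff's actual computation may differ.

```python
def late_adjusted_score(raw_score, days_late):
    """Score after the late penalty.

    ASSUMPTION: 20% of the earned score is deducted per day late,
    linearly, and the result never drops below zero.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    # Work in whole percentage points to avoid floating-point drift.
    remaining_percent = max(0, 100 - 20 * days_late)
    return raw_score * remaining_percent / 100


print(late_adjusted_score(90, 0))  # 90.0 (on time)
print(late_adjusted_score(90, 2))  # 54.0 (two days late)
print(late_adjusted_score(90, 5))  # 0.0  (five or more days late)
```

Under this reading, a submission five or more days late earns no credit.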
The objective of the project is to demonstrate mastery of data ingestion, processing, predictive modeling, and communication of key results. Suitable documentation of this process, along with the complete set of scripts, code, dataset samples, and commands used, is to be submitted. See the project page for more detailed instructions. Because this is a group project, a commensurate effort is expected, and each member's contributions need to be reported in the final submission.
- Assignments: 12% + 12% + 12% + 12% + 12%
- Project: 5% for intermediate report and 30% for the final (more details in the project page linked above)
- Course participation: 5% (includes but is not limited to attendance, interaction with the instructor and the TA, and how well you support your group).
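The weights above sum to 100%, and a course total can be sketched as a weighted average. The snippet below is only an illustration: the component names, the `final_score` helper, and the example scores are hypothetical, and it assumes every component is graded on a 0 to 100 scale.

```python
# Weights in percentage points, copied from the grading breakdown above.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "assignment_1": 12,
    "assignment_2": 12,
    "assignment_3": 12,
    "assignment_4": 12,
    "assignment_5": 12,
    "project_intermediate": 5,
    "project_final": 30,
    "participation": 5,
}

# Sanity check: the components account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def final_score(scores):
    """Weighted course total on a 0-100 scale.

    ASSUMPTION: each entry of `scores` is on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Made-up example: 80 on everything except 90 on the final project report.
example_scores = {name: 80 for name in WEIGHTS}
example_scores["project_final"] = 90
print(final_score(example_scores))  # 83.0
```

Because the final project report carries 30% of the grade, a 10-point improvement there moves the total by 3 points, versus 1.2 points for a single assignment.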
- This is a 4 credit graduate level course offered by the Information and Decision Sciences department at UIC.
- Please see the academic calendar for the semester timeline.
- Students who wish to observe their religious holidays (http://oae.uic.edu/religious-calendar/) should notify the instructor within one week of the first lecture date.
- Please contact the instructor as early as possible if you require accommodations for access to and/or participation in this course.
- Please refer to the academic integrity guidelines set by the university.