Public repository for STAT406 @ UBC - "Elements of Statistical Learning".
The notes in this repository are released under the "Creative Commons Attribution-ShareAlike 4.0 International" license. See the human-readable version here and the real thing here.
The course outline is available here.
The tentative week-by-week schedule is here.
You can register in the course's PIAZZA page via Canvas.
In order to complete the WebWork quizzes you need to register via Canvas: go to the course Canvas page, click on Assignments, then on WebWork Link, and finally click on Load WebWork Link on a new window. This is a necessary step (don't shoot the messenger!) but you only need to do this once.
This is a list of strongly recommended pre-class reading. [JWHT13] and [HTF09] indicate two of the reference books listed below.
- Week 1 (L1): Review of Linear Regression
- Sections 2.1, 2.1.1, 2.1.2, 2.1.3, 2.2, 2.2.1 from [JWHT13]
- Sections 2.4 and 2.6 from [HTF09].
- Week 2 (L2/3): Goodness of Fit vs Prediction error, Cross Validation
- Sections 5.1, 5.1.1, 5.1.2, 5.1.3 from [JWHT13]
- Sections 7.1, 7.2, 7.3, 7.10 from [HTF09].
- Week 3 (L4/5): Correlated predictors, Feature selection, AIC
- Sections 6.1, 6.1.1, 6.1.2, 6.1.3, 6.2 and 6.2.1 from [JWHT13]
- Sections 7.4, 7.5 from [HTF09].
- Week 4 (L6/MT1): Ridge regression, LASSO, Elastic Net
- Sections 6.2 (complete) from [JWHT13]
- Sections 3.4, 3.8, 3.8.1, 3.8.2 from [HTF09]
- Week 5 (L7/8): Elastic Net, Smoothers (Local regression, Splines)
- Sections 7.1, 7.3, 7.4, 7.5, 7.6 from [JWHT13]
- Week 6 (L9/10): Curse of dimensionality, Regression Trees
- Sections 8.1, 8.1.1, 8.1.3, 8.1.4 from [JWHT13]
- Week 7 (L11/MT2): Bagging
- Sections 8.2, 8.2.1 from [JWHT13]
- Week 8 (L12/13): Classification, LDA, QDA, Logistic Regression
- Sections 4.1, 4.2, 4.3, 4.4, 2.2.3 from [JWHT13]
- Week 9 (L14/15): Trees, Ensembles, Bagging
- Sections 8.1.2, 8.2.1 and 8.2.2 from [JWHT13]
- Week 10 (L16/MT3): Random Forests
- Sections 8.2.1 and 8.2.2 from [JWHT13]
- Week 11 (L17/18): Boosting, Neural Networks?
- Section 8.2.3 from [JWHT13]
- Sections 10.1 - 10.10 (except 10.7), 11.3 - 11.5, 11.7 from [HTF09]
- Week 12 (L19/20): Unsupervised learning, K-means, model-based clustering
- Section 10.3 from [JWHT13]
- Sections 13.2, 14.3 from [HTF09]
- Week 13 (L21/L22): Hierarchical clustering, Principal Components, Multidimensional Scaling
- Sections 10.2, 10.3 from [JWHT13]
- Sections 8.5, 14.3, 14.5.1, 14.8, 14.9 from [HTF09]
- [JWHT13]: James, G., Witten, D., Hastie, T. and Tibshirani, R. An Introduction to Statistical Learning. 2013. Springer-Verlag New York.
- [HTF09]: Hastie, T., Tibshirani, R. and Friedman, J. The Elements of Statistical Learning. 2009. Second Edition. Springer-Verlag New York.
- [MASS]: Venables, W.N. and Ripley, B.D. Modern Applied Statistics with S. 2002. Fourth Edition. Springer, New York.
- R: This is the software we will use in the course. I will assume that you are familiar with it (in particular, that you know how to write your own functions and loops). If needed, there are plenty of resources online to learn R.
- RStudio: The IDE (integrated development environment) of choice for R. Not necessary, but helpful.
- Jupyter Notebooks. "The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text."
You can use these to interactively run and play with the lecture notes and the code to reproduce all the examples I use in class. This is not necessary, but may be helpful. There are two options to run notebooks: locally on your own computer or use a remote server:
- Follow the instructions here to install Jupyter on your laptop. You will also need to follow these instructions to install the R kernel for Jupyter.
- Alternatively, you can run the notebooks on the syzygy server. There are Julia, Python 2, Python 3, and R kernels available (although we will only use the R one). Sign in with your UBC CWL. Once you are logged in, use this link to clone this repository (STAT406) (including all notebooks) directly onto your syzygy home directory. You will need to do this regularly throughout the Term.