The Stanford Online course STATSX0001 "Statistical Learning" closely follows the sequence of chapters in the course text "An Introduction to Statistical Learning, with Applications in R" (James, Witten, Hastie, Tibshirani - Springer 2013). Instructors: Trevor Hastie, Professor of Statistics and of Biomedical Data Sciences, Stanford University, and Robert Tibshirani, Professor of Biomedical Data Science and Statistics, Stanford University.

Stanford-Statistical Learning

About this course

This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines. Some unsupervised learning methods are discussed: principal components and clustering (k-means and hierarchical).

This is not a math-heavy class, so we try to describe the methods without heavy reliance on formulas and complex mathematics. We focus on what we consider to be the important elements of modern data analysis. Computing is done in R. There are lectures devoted to R, starting with tutorials from the ground up and progressing to more detailed sessions that implement the techniques in each chapter.

The lectures cover all the material in An Introduction to Statistical Learning, with Applications in R by James, Witten, Hastie and Tibshirani (Springer, 2013). A PDF of the book is available for free on the book website.

Sections are organized by chapter. The first two sections give an overview of statistical learning and cover the first two chapters of the book. All materials are available now, but the schedule below recommends an order for working through the content.

Week 1: Introduction and Overview of Statistical Learning (Chapters 1-2)

Week 2: Linear Regression (Chapter 3)

Week 3: Classification (Chapter 4)

Week 4: Resampling Methods (Chapter 5)

Week 5: Linear Model Selection and Regularization (Chapter 6)

Week 6: Moving Beyond Linearity (Chapter 7)

Week 7: Tree-based Methods (Chapter 8)

Week 8: Support Vector Machines (Chapter 9)

Week 9: Unsupervised Learning (Chapter 10)