Statistics and machine learning: from undergraduate to research

by Edgar Dobriban, Associate Professor of Statistics and Data Science, the Wharton School, with a secondary appointment in Computer and Information Science, University of Pennsylvania

  • This repository contains links to references (books, courses, etc.) that are useful for learning statistics and machine learning, as well as some neighboring topics. References for background material such as linear algebra, calculus/analysis/measure theory, and probability theory are usually not included.

  • The level of the references starts from advanced undergraduate stats/math/CS and in some cases goes up to the research level. The books are often standard references and textbooks, used at leading institutions. In particular, several of the books are used in the standard curriculum of the PhD program in Statistics at Stanford University (where I learned from them as well), as well as at the University of Pennsylvania (where I work). The goal is to benefit students, researchers seeking to enter new areas, and lifelong learners.

  • For each topic, materials are listed in rough order from basic to advanced.

  • The list is highly subjective and incomplete, reflecting my own preferences, interests, and biases. For instance, there is an emphasis on theoretical material. Most of the references included here are ones that I have at least partially (and sometimes extensively) studied and found helpful; others are on my to-read list. Several topics are omitted due to my lack of expertise (e.g., causal inference, Bayesian statistics, time series, sequential decision-making, functional data analysis, biostatistics, ...).

  • The links are to freely available author copies if those are available, or to online marketplaces otherwise (you are encouraged to search for the best price).

  • How to use these materials to learn: To be an efficient researcher, certain core material must be mastered. However, there is far too much specialized knowledge for anyone to master it all. Fortunately, it is often enough to know what types of results/methods/tools are available, and where to find them. When they are needed, they can be recalled and used.

  • Please feel free to contact me with suggestions.

Statistics

Principles and overview

  • Casella & Berger: Statistical Inference (2nd Edition) - Possibly the best introduction to the principles of statistical inference at an advanced undergraduate level. Mathematically rigorous but not technical. Covers key ideas and tools for constructing and evaluating estimators:
    • Data reduction (sufficiency, likelihood principle),
    • Methods for finding estimators (method of moments, maximum likelihood estimation, Bayes estimators),
    • Methods for evaluating estimators (mean squared error, bias and variance, best unbiased estimators, loss-function optimality),
    • Hypothesis testing (likelihood ratio tests, power), confidence intervals (pivotal quantities, coverage),
    • Asymptotics (consistency, efficiency, bootstrap, robustness).
  • Wasserman: All of Statistics: A Concise Course in Statistical Inference - A panoramic overview of statistics; mathematical but proofs are omitted. Covers material overlapping with ESL, TSH, TPE (abbreviations defined below), and other books in this list.
  • Cox: Principles of Statistical Inference - Covers a number of classical principles and ideas, such as pivotal inference, ancillarity, and conditioning, including famous paradoxes. Light on mathematics, but containing deep ideas.
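To give a flavor of the "finding vs. evaluating estimators" theme above, here is a hypothetical toy simulation (my own illustration, not an exercise from any of these books): for i.i.d. Uniform(0, θ) data, the method-of-moments estimator 2·X̄ is unbiased, while the maximum-likelihood estimator max_i X_i is biased downward yet achieves a much smaller mean squared error.

```python
import numpy as np

# Toy comparison (illustrative, not from the books listed above):
# X_1, ..., X_n i.i.d. Uniform(0, theta).
#   Method of moments: 2 * sample mean (unbiased, variance theta^2 / (3n)).
#   Maximum likelihood: sample maximum (biased down, MSE 2 theta^2 / ((n+1)(n+2))).
rng = np.random.default_rng(0)
theta, n, reps = 1.0, 20, 100_000

x = rng.uniform(0.0, theta, size=(reps, n))
mom = 2.0 * x.mean(axis=1)   # method-of-moments estimator
mle = x.max(axis=1)          # maximum-likelihood estimator

for name, est in [("MoM", mom), ("MLE", mle)]:
    bias = est.mean() - theta
    var = est.var()
    print(f"{name}: bias={bias:+.4f}  var={var:.5f}  MSE={bias**2 + var:.5f}")
```

The simulation illustrates the bias-variance decomposition of MSE: the MLE trades a small downward bias for a variance of smaller order (1/n² rather than 1/n), so it dominates in MSE for this model.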

Statistical Methodology

Statistical Theory

Core Theory: First Year PhD Curriculum

Advanced Theory

This section is the most detailed one, as it is the closest to my research.

Non-parametrics, minimax lower bounds

  • Tsybakov: Introduction to Nonparametric Estimation - The first two chapters contain many core results and techniques in nonparametric estimation, including lower bounds (Le Cam, Fano, Assouad).
  • Weissman, Ozgur, Han: Stanford EE 378 Course Materials. Lecture Notes - Possibly the most comprehensive set of materials on information-theoretic lower bounds, including estimation and testing (Ingster's method), with examples from high-dimensional problems, optimization, etc.
  • Johnstone: Gaussian estimation: Sequence and wavelet models - Beautiful overview of estimation in Gaussian noise (shrinkage, wavelet thresholding, optimality). Rigorous and deep, with challenging exercises.
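To give a taste of the lower-bound techniques covered in these references (my paraphrase of a standard result; see Tsybakov's Chapter 2 for precise conditions): Le Cam's two-point method reduces estimation to testing between two hypotheses. If $d$ is a semimetric on the parameter space and $d(\theta(P_0), \theta(P_1)) \ge 2\delta$, then

```latex
\inf_{\hat\theta}\ \max_{j \in \{0,1\}}\ \mathbb{E}_{P_j}\, d\bigl(\hat\theta,\, \theta(P_j)\bigr)
\;\ge\; \delta \bigl(1 - \mathrm{TV}(P_0, P_1)\bigr),
```

where $\mathrm{TV}$ is the total variation distance. Thus two parameters that are $2\delta$-separated in the loss, yet generated by distributions that are hard to distinguish ($\mathrm{TV}$ bounded away from 1), force every estimator to incur worst-case risk of order $\delta$. Fano's and Assouad's methods extend this idea to many hypotheses.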

Overviews of statistical machine learning theory

Semiparametrics

Multivariate statistical analysis

Subsampling

Empirical processes

High dimensional (mean field, proportional limit) asymptotics; random matrix theory (RMT) for stats+ML

Applications and case studies

Machine Learning

ML Theory

Deep Learning

DL Practice and Conceptual Understanding

Safe AI

DL Theory

This area is subject to active development and research; there is no complete reference.

Language Models

Uncertainty quantification

Complements

Optimization

Probability

Concentration inequalities

Chaining

  • Talagrand: Upper and Lower Bounds for Stochastic Processes - Chaining is a theoretical tool, developed in its modern (generic chaining) form by Talagrand, that can often give optimal bounds on the suprema of stochastic processes, even when standard concentration inequalities fail to do so. This is a readable, yet rigorous and complete, reference by the inventor of the theory.
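As a sketch of what chaining delivers (a standard result, stated from memory; see the book for precise formulations): the classical Dudley entropy-integral bound, which generic chaining sharpens, states that for a centered Gaussian process $(X_t)_{t \in T}$ with canonical metric $d(s,t) = \bigl(\mathbb{E}(X_s - X_t)^2\bigr)^{1/2}$,

```latex
\mathbb{E} \sup_{t \in T} X_t
\;\le\; C \int_0^{\operatorname{diam}(T)} \sqrt{\log N(T, d, \varepsilon)}\,\mathrm{d}\varepsilon,
```

where $N(T, d, \varepsilon)$ is the $\varepsilon$-covering number of $T$ and $C$ is an absolute constant. Talagrand's generic chaining replaces the entropy integral with the $\gamma_2(T,d)$ functional, which by the majorizing measure theorem characterizes $\mathbb{E} \sup_t X_t$ up to universal constants for Gaussian processes.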