This is the course homepage for STAT 422/722 for the Spring 2017 semester at The Wharton School of the University of Pennsylvania, taught by Professor Adam Kapelner. The syllabus can be found here.
Audio for lectures should be on Canvas, except for the first lecture (links below).
- Lecture 1 (audio sec A) (audio sec B) slides (view) (download)
- Lecture 2 slides (view) (download)
- Lecture 3 slides (view) (download)
- Lecture 4 slides (view) (download)
- Lecture 5 slides (view) (download)
- Lecture 6 slides (view) (download)
- Homework 1 (download) (view) (due 2/2/17) Solutions (download) (view)
- Optional Homework 2 (download) (view) (due 2/15/17) Solutions (download) (view)
- Optional Homework 3 (download) (view) (due 2/27/17) Solutions (download) (view)
- Project (download) (view) (writeup due 3/3/17 NOON in the Statistics office)
- Forecasting Competition for the observations found here (see the project; your csv file is due 2/26/17 5PM on Canvas, along with your historical dataframe as csv or jmp). An example historical dataset is found here.
- Final 2/28/17 and 3/1/17 in class (last 120 minutes).
List of topics not covered on the final exam:
- Anything about how deep learning works (you only need to know that it exists and can solve certain types of problems)
- Anything about Laplace's demon, determinism, or alternative explanations (e.g. quantum theory) for the noise beyond unavailable information
- Calculating likelihoods (but you will need to understand likelihood values given to you and the likelihood ratio test, as I will be giving you critical chi-squared values; a short sketch of the test appears after this list)
- Sidak and Scheffe corrections for multiple testing (you will only need to know the Bonferroni correction)
- Anything about how the computer numerically maximizes likelihood to find a fit (but you will need to know that this process exists)
- I've decided to leave out equivalence testing (much to my dismay)
- The specifics of D-optimality and I-optimality
- Anything about probit and cloglog link functions
- The Wald test or the score test
- Using the chi-squared distribution with one degree of freedom to produce z-scores
- All fit metrics of logistic regression except misclassification error, weighted misclassification error, and AUC (a sketch of these three appears after this list)
- Using the expected value of the profit matrix
- The interpretation of a single fold's test performance vs. the interpretation of K-fold CV's aggregate out-of-sample performance (this was a subtle point)
- K-fold CV for three splits
- Nested K-fold CV for tuning machine learning algorithms
- Backward and mixed stepwise regression (only forward stepwise regression is covered)
- Tuning parameters in random forests (RF)
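Since the likelihood ratio test itself remains fair game (see the likelihoods item above), here is a minimal sketch of how it is applied, assuming you are handed the two log-likelihoods and a critical chi-squared value as promised above. All numbers are hypothetical, not from the course materials.

```python
# Likelihood ratio test sketch: decide between a reduced (null) model and a
# full model given their log-likelihoods and a critical chi-squared value.

# Hypothetical log-likelihoods as reported by the fitting software.
ll_reduced = -412.7
ll_full = -405.1

# The LR statistic is twice the gain in log-likelihood; under the null it is
# approximately chi-squared with df = number of extra parameters in the full model.
lr_stat = 2 * (ll_full - ll_reduced)  # 15.2

# Critical value for df = 2 at alpha = 0.05 (this would be given on the exam).
chi_sq_crit = 5.99

# Reject the reduced model if the statistic exceeds the critical value.
print(lr_stat, lr_stat > chi_sq_crit)  # 15.2 True
```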
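Similarly, the three logistic regression fit metrics that do remain on the exam can be computed by hand. The sketch below uses made-up labels and predicted probabilities, and the false-negative cost of 5 is an illustrative assumption.

```python
# Misclassification error, weighted misclassification error, and AUC for a
# toy set of true labels and predicted probabilities.
y = [1, 0, 1, 1, 0, 0, 1, 0]                       # true labels
p_hat = [0.9, 0.4, 0.7, 0.3, 0.2, 0.6, 0.8, 0.1]   # predicted P(y = 1)

# Classify at the usual 0.5 threshold, then count errors.
y_hat = [1 if p > 0.5 else 0 for p in p_hat]
misclass_err = sum(yi != yh for yi, yh in zip(y, y_hat)) / len(y)  # 0.25

# Weighted version: assume a false negative costs 5x a false positive.
COST_FN, COST_FP = 5.0, 1.0
weighted_err = sum(COST_FN if yi == 1 else COST_FP
                   for yi, yh in zip(y, y_hat) if yi != yh) / len(y)  # 0.75

# AUC via its rank interpretation: the probability that a random positive
# gets a higher predicted probability than a random negative (ties = 1/2).
pos = [p for p, yi in zip(p_hat, y) if yi == 1]
neg = [p for p, yi in zip(p_hat, y) if yi == 0]
auc = sum((pp > pn) + 0.5 * (pp == pn)
          for pp in pos for pn in neg) / (len(pos) * len(neg))  # 0.875

print(misclass_err, weighted_err, auc)
```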
Office hours:
- Wed 3-4:30PM SHDH 109 Gemma Moran
- Thu 4:30-6PM SHDH 109 Gemma Moran