Data Science Diagnostics

This repo compiles a short list of typical diagnostics used when building regression and prediction models. The list is not exhaustive, nor should every diagnostic be applied to every model.

Each diagnostic file is a short synopsis of one diagnostic and includes:

  1. A brief definition
  2. The relevant equations that lead to the diagnostic
  3. A sample plot of the diagnostic (if applicable)
  4. Interpretation and/or use of the diagnostic when appropriate
  5. Suggested next steps after using the diagnostic
  6. Some example R code for using the diagnostic

If you find these useful, feel free to use them. If you discover any errors or have suggestions to improve these files, please share them so I can make the files better!

Thank you!

Current List

Standardized Residuals
Studentized Residuals
PRESS Statistic
Cook's Distance
VIF
DFBETA
DFFITS
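
As a rough orientation, the residual- and influence-based entries in this list can all be obtained from a fitted lm object with base R. The sketch below is only an illustration: the mtcars data set, the formula, and the cutoffs are assumptions chosen for the example and are not taken from the diagnostic files.

```r
# Sketch only: mtcars and this formula are stand-ins for your own data and model.
fit <- lm(mpg ~ wt + hp, data = mtcars)

std_res  <- rstandard(fit)       # standardized residuals (scaled with the overall error estimate)
stud_res <- rstudent(fit)        # studentized residuals (error estimate leaves the observation out)
h        <- hatvalues(fit)       # leverage of each observation
press    <- sum((residuals(fit) / (1 - h))^2)  # PRESS from leave-one-out prediction errors
cooks_d  <- cooks.distance(fit)  # Cook's distance per observation
dfb      <- dfbeta(fit)          # change in each coefficient when an observation is dropped
dff      <- dffits(fit)          # scaled change in the fitted value when an observation is dropped

# Common rules of thumb (conventions, not hard limits).
n <- nobs(fit)
p <- length(coef(fit))
flag_cooks  <- which(cooks_d > 4 / n)
flag_dffits <- which(abs(dff) > 2 * sqrt(p / n))
```

The flagged observations are candidates for a closer look, not automatic deletions. PRESS is always at least as large as the ordinary residual sum of squares, so comparing the two gives a quick sense of how much the fit depends on individual points.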

Upcoming List

AIC
BIC
Mallows's Cp
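
Until those files exist, the following sketch shows one way these three criteria might be computed in base R. AIC() and BIC() are standard functions; mallows_cp() is a hypothetical helper written for this example, and the mtcars models are assumed stand-ins.

```r
# Sketch only: two nested candidate models on a stand-in data set.
full  <- lm(mpg ~ wt + hp + qsec, data = mtcars)
small <- lm(mpg ~ wt + hp, data = mtcars)

aic_table <- AIC(small, full)  # lower is better
bic_table <- BIC(small, full)  # lower is better, with a heavier penalty per parameter

# Hypothetical helper: Cp = RSS_p / sigma2_full - n + 2p,
# where p counts the candidate's coefficients (intercept included)
# and sigma2_full is the error variance estimated from the full model.
mallows_cp <- function(candidate, full) {
  n <- nobs(full)
  p <- length(coef(candidate))
  sigma2_full <- sum(residuals(full)^2) / df.residual(full)
  sum(residuals(candidate)^2) / sigma2_full - n + 2 * p
}

cp_small <- mallows_cp(small, full)
cp_full  <- mallows_cp(full, full)  # equals the number of coefficients in the full model
```

A candidate whose Cp is close to its own coefficient count is usually read as having little bias relative to the full model. The full model always satisfies this by construction, so Cp is only informative for the smaller candidates.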

When should each statistic be used?

Assessing the model choices BEFORE finalizing the regression
Cook's Distance
VIF
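
VIF depends only on the predictors, so it can be checked without touching the response. A minimal sketch, assuming three numeric predictors from mtcars as a stand-in, computes it two equivalent ways:

```r
# Sketch only: the predictor set is an assumption for illustration.
X <- mtcars[, c("wt", "hp", "qsec")]

# Definition: regress each predictor on the others, then VIF_j = 1 / (1 - R_j^2).
vif_manual <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})

# Shortcut: the diagonal of the inverse correlation matrix gives the same values.
vif_inverse <- diag(solve(cor(X)))
```

Cutoffs such as 5 or 10 are often quoted as signs of troublesome collinearity, but they are conventions rather than tests. For a fitted lm with numeric predictors, vif() from the car package reports the same quantity.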

Post-Regression Diagnostics
Standardized Residuals
Studentized Residuals
PRESS Statistic
AIC
BIC
Mallows's Cp
DFBETA
DFFITS