Opinionated list of resources facilitating model interpretability (introspection, simplification, visualization, explanation).
- Interpretable models (a minimal decision-tree sketch follows the references below)
- Simple decision trees
- Rules
- (Regularized) linear regression
- k-NN
- Interpretable Classifiers Using Rules and Bayesian Analysis: Building a Better Stroke Prediction Model https://arxiv.org/pdf/1511.01644.pdf
- Comprehensible Classification Models – a position paper by Alex A. Freitas
- Interesting discussion of interpretability for a few classification models (decision trees, classification rules, decision tables, nearest neighbors, and Bayesian network classifiers)
- http://www.kdd.org/exploration_files/V15-01-01-Freitas.pdf
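A minimal sketch of why small decision trees count as interpretable: scikit-learn can render a fitted tree as nested if-then rules readable without any tooling (the dataset and depth below are arbitrary choices for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
# A shallow tree stays small enough to read as a handful of rules
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)
print(export_text(tree, feature_names=list(data.feature_names)))
```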
- Models offering feature importance measures
- Random forest
- Boosted trees
- Extremely randomized trees https://doi.org/10.1007/s10994-006-6226-1
- Linear regression (with a grain of salt)
- Model Class Reliance: Variable Importance Measures for any Machine Learning Model Class, from the “Rashomon” Perspective
- Universal (model-agnostic) variable importance measure
- https://arxiv.org/pdf/1801.01489
- https://github.com/aaronjfisher/mcr
- Bias in random forest variable importance measures: Illustrations, sources and a solution https://doi.org/10.1186/1471-2105-8-25
- Conditional variable importance for random forests https://doi.org/10.1186/1471-2105-9-307
- Very good blog post describing the deficiencies of random forest feature importance and introducing permutation importance as an alternative
- Permutation importance, a simple model-agnostic approach, is described in the ELI5 documentation (a minimal sketch follows below)
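A minimal sketch of permutation importance next to a random forest's built-in impurity-based importances (the dataset and scorer are arbitrary stand-ins, not anything prescribed by the posts above):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

baseline = model.score(X_val, y_val)
rng = np.random.RandomState(0)
permutation_importance = []
for j in range(X_val.shape[1]):
    X_perm = X_val.copy()
    rng.shuffle(X_perm[:, j])  # break the feature/target relationship for one feature
    permutation_importance.append(baseline - model.score(X_perm, y_val))

# Compare with the (potentially biased) impurity-based importances
for j in np.argsort(permutation_importance)[-5:]:
    print(j, permutation_importance[j], model.feature_importances_[j])
```

Note that the importances are measured on held-out data, and only `score` (or any other evaluation metric) is needed, which is exactly what makes the approach model-agnostic.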
- Classification of feature selection methods
- Filters
- Wrappers
- Embedded methods
- Feature Engineering and Selection by Kuhn & Johnson
- Slightly off-topic, but a very interesting book
- http://www.feat.engineering/index.html
- https://bookdown.org/max/FES/
- https://github.com/topepo/FES
- Magnets by R. P. Feynman https://www.youtube.com/watch?v=wMFPe-DwULM
- The Mythos of Model Interpretability
- The Promise and Peril of Human Evaluation for Model Interpretability
- Towards A Rigorous Science of Interpretable Machine Learning https://arxiv.org/pdf/1702.08608
- The Book of Why: The New Science of Cause and Effect by Judea Pearl
- Looking Inside the Black Box, a presentation by Leo Breiman
- LIME (Local Interpretable Model-agnostic Explanations)
- SHAP (SHapley Additive exPlanations), generalizing LIME
- Anchors: High-Precision Model-Agnostic Explanations, another improvement over LIME (see the LIME/SHAP usage sketch below)
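Since the three entries above build on one another, a minimal usage sketch may help; it explains a single prediction of one model with both the `lime` and `shap` Python packages (the dataset, model, and parameters are arbitrary choices):

```python
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(data.data, data.target)

# LIME: fit a sparse local surrogate model around one instance
lime_explainer = LimeTabularExplainer(
    data.data, feature_names=list(data.feature_names), class_names=list(data.target_names))
lime_exp = lime_explainer.explain_instance(data.data[0], clf.predict_proba, num_features=5)
print(lime_exp.as_list())  # (feature condition, local weight) pairs

# SHAP: Shapley-value attributions for the same instance
shap_explainer = shap.TreeExplainer(clf)
print(shap_explainer.shap_values(data.data[:1]))
```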
- Explanations of Model Predictions with live and breakDown Packages
- Model Explanation System by Ryan Turner
- Understanding Black-box Predictions via Influence Functions
- A review book - Interpretable Machine Learning. A Guide for Making Black Box Models Explainable by Christoph Molnar
- Visualizing and Understanding Convolutional Networks
- Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
- Understanding Neural Networks Through Deep Visualization
- Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
- Generating Visual Explanations
- Rationalizing Neural Network Predictions
- Pixel entropy can be used to detect relevant picture regions (for ConvNets)
- See Visualization section and Fig. 5 of the paper High-Resolution Breast Cancer Screening with Multi-View Deep Convolutional Neural Networks
- SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability
- Visual Explanation by Interpretation: Improving Visual Feedback Capabilities of Deep Neural Networks
- Axiomatic Attribution for Deep Networks
- Proposes the Integrated Gradients method (a minimal sketch follows the links below)
- https://arxiv.org/pdf/1703.01365.pdf
- Code: https://github.com/ankurtaly/Integrated-Gradients
- See also: Gradients of Counterfactuals https://arxiv.org/pdf/1611.02639.pdf
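A minimal sketch of the integrated gradients computation itself; `grad_fn` here is a hypothetical user-supplied function returning the gradient of the model output at a point, while the straight-line path and Riemann-sum approximation follow the paper:

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate IG_i = (x_i - baseline_i) * integral of dF/dx_i along the path."""
    alphas = (np.arange(steps) + 0.5) / steps  # midpoints of the baseline-to-x path
    grads = np.stack([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Toy check on F(x) = x0**2 + 3*x1, whose gradient is (2*x0, 3)
grad_fn = lambda x: np.array([2.0 * x[0], 3.0])
attributions = integrated_gradients(grad_fn, np.array([1.0, 2.0]), np.zeros(2))
print(attributions, attributions.sum())  # sums to F(x) - F(baseline) = 7
```

The final `print` illustrates the paper's completeness axiom: attributions sum to the difference between the outputs at `x` and at the baseline.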
- Learning Important Features Through Propagating Activation Differences
- Proposes the DeepLIFT method (a minimal sketch of its core idea follows the links below)
- https://arxiv.org/pdf/1704.02685.pdf
- Code: https://github.com/kundajelab/deeplift
- Videos: https://www.youtube.com/playlist?list=PLJLjQOkqSRTP3cLB2cOOi_bQFw6KPGKML
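DeepLIFT attributes the difference between an output and its value at a reference input. A minimal sketch of this "summation-to-delta" property for the purely linear case, where the contributions have a closed form (the weights, input, and reference are arbitrary numbers):

```python
import numpy as np

# For a linear model f(x) = w @ x + b, DeepLIFT's contributions reduce to
# C_i = w_i * (x_i - x0_i), and they sum exactly to f(x) - f(x0).
w, b = np.array([0.5, -1.2, 2.0]), 0.3
f = lambda x: w @ x + b

x = np.array([1.0, 0.0, 2.0])   # input to explain
x0 = np.zeros(3)                # reference input
contributions = w * (x - x0)
assert np.isclose(contributions.sum(), f(x) - f(x0))
print(contributions)
```

For nonlinear networks the method propagates such difference-from-reference contributions layer by layer, which is what the `deeplift` code linked above implements.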
- The (Un)reliability of saliency methods
- Review of failures for methods extracting most important pixels for prediction
- https://arxiv.org/pdf/1711.00867.pdf
- Classifier-agnostic Saliency Map Extraction
- The Building Blocks of Interpretability
- https://distill.pub/2018/building-blocks
- Has some embedded links to notebooks
- Uses Lucid library https://github.com/tensorflow/lucid
- Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples
- Distilling a Neural Network Into a Soft Decision Tree
- Visualizing Statistical Models: Removing the blindfold
- Partial dependence plots (a minimal computation sketch follows the links below)
- http://scikit-learn.org/stable/auto_examples/ensemble/plot_partial_dependence.html
- pdp: An R Package for Constructing Partial Dependence Plots https://journal.r-project.org/archive/2017/RJ-2017-016/RJ-2017-016.pdf https://cran.r-project.org/web/packages/pdp/index.html
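The computation behind a partial dependence plot fits in a few lines; this sketch clamps a single feature to each grid value and averages the model's predictions (`model` and `X` are hypothetical stand-ins for any fitted estimator and its feature matrix):

```python
import numpy as np

def partial_dependence(model, X, feature_idx, grid):
    # Average prediction with feature `feature_idx` clamped to each grid value
    curve = []
    for value in grid:
        X_clamped = X.copy()
        X_clamped[:, feature_idx] = value
        curve.append(model.predict(X_clamped).mean())
    return np.asarray(curve)

# Usage sketch: a grid spanning the observed range of feature 0
# grid = np.linspace(X[:, 0].min(), X[:, 0].max(), num=20)
# curve = partial_dependence(model, X, 0, grid)
```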
- ggfortify: Unified Interface to Visualize Statistical Results of Popular R Packages
- RandomForestExplainer
- ggRandomForest
- Tutorial on Interpretable Machine Learning at ICML 2017
- P. Biecek, Show Me Your Model: tools for visualisation of statistical models
- S. Ritchie, Just-So Stories of AI
- C. Jarmul, Towards Interpretable Accountable Models
- I. Ozsvald, Machine Learning Libraries You’d Wish You’d Known About
- A large part of the talk covers model explanation and visualization
- Video: https://www.youtube.com/watch?v=nDF7_8FOhpI
- Associated notebook on explaining regression predictions: https://github.com/ianozsvald/data_science_delivered/blob/master/ml_explain_regression_prediction.ipynb
- G. Varoquaux, Understanding and diagnosing your machine-learning models (covers PDP and Lime among others)
- Interpretable ML Symposium (NIPS 2017) (contains links to papers, slides and videos)
- http://interpretable.ml/
- Debate: Interpretability is necessary in machine learning
- 2017 Workshop on Human Interpretability in Machine Learning (WHI) (in conjunction with ICML 2017) (contains links to papers and slides)
Software related to papers is mentioned above along with each publication; this section lists only standalone software.
- ELI5 - Python package dedicated to debugging machine learning classifiers and explaining their predictions (a minimal usage sketch closes this section)
- yellowbrick - visual analysis and diagnostic tools to facilitate machine learning model selection
- lime - R package implementing LIME
- forestmodel - R package visualizing coefficients of different models with the so-called forest plot
- DALEX - Descriptive mAchine Learning EXplanations
- Lucid - a collection of infrastructure and tools for research in neural network interpretability
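As a closing illustration, a minimal ELI5 usage sketch; `explain_weights` and `explain_prediction` are the package's two core entry points, while the estimator and data are arbitrary choices:

```python
import eli5
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

data = load_iris()
clf = LogisticRegression(max_iter=1000).fit(data.data, data.target)
names = list(data.feature_names)

# Global view: the model's coefficients as feature weights
print(eli5.format_as_text(eli5.explain_weights(clf, feature_names=names)))

# Local view: per-feature contributions to a single prediction
print(eli5.format_as_text(eli5.explain_prediction(clf, data.data[0], feature_names=names)))
```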