A repository for all things interpretability. The code in this repo was written to support my blog post on this topic. Application areas covered:
- Criminal Justice
- Retail
- Computer Vision
- Language
Talks:
- The Trouble with Bias: keynote by Kate Crawford at NeurIPS 2017
- Bias Traps in AI: Panel Discussion at AI Now 2017
Papers:
- The Mythos of Model Interpretability (2017)
- A Survey of Methods for Explaining Black Box Models (2018)
- Challenges for Transparency (2017)
- Manipulating and Measuring Model Interpretability (2018)
- Datasheets for Datasets (2018)
- A Causal Framework for Explaining the Predictions of Black-Box Sequence-to-Sequence Models (2017)
- European Union Regulations on Algorithmic Decision-Making and a "Right to Explanation" (2016)
- Consistent Feature Attribution for Tree Ensembles (2018)
- A Unified Approach to Interpreting Model Predictions (SHAP, 2017); a usage sketch follows this list
- Towards a Rigorous Science of Interpretable ML (2017)
- "Why Should I Trust You?": Explaining the Predictions of Any Classifier (LIME, 2016); a usage sketch follows this list
- Anchors: High-Precision Model-Agnostic Explanations (2018)
- DeepLIFT: Learning Important Features Through Propagating Activation Differences (2017)
- Explaining Explanations: An Approach to Evaluating Interpretability of Machine Learning (2018)
- Interpretable Machine Learning for Privacy-Preserving Pervasive Systems (2017)
- How Do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation (2018)
- Opportunities in ML for Healthcare (2018)
- Predictive Learning via Rule Ensembles (RuleFit) (2005)
- Ethics and AI
  - Building Ethics into AI (2018)
- Deep k-Nearest Neighbors: Towards Confident, Interpretable and Robust Deep Learning (2018)
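
The SHAP paper above has an open-source implementation in the `shap` package. Here is a minimal usage sketch, assuming `shap` and scikit-learn are installed; the dataset and model are illustrative stand-ins, not the ones from my blog post:

```python
# Sketch: Shapley-value feature attributions with the shap package.
# Assumes `pip install shap scikit-learn`; the model and data are illustrative.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer implements the fast, consistent attribution method for
# tree ensembles described in the Lundberg et al. papers listed above.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])  # per-feature contributions
print(shap_values)
```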
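
Likewise, the LIME paper is implemented in the open-source `lime` package. A minimal sketch under the same assumptions (illustrative model and data, `lime` and scikit-learn installed):

```python
# Sketch: local surrogate explanations with the lime package (LIME).
# Assumes `pip install lime scikit-learn`; the model and data are illustrative.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)
# Explain one prediction by fitting a sparse linear model to perturbed
# samples around the instance, weighted by proximity.
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=5
)
print(explanation.as_list())  # top weighted features for this instance
```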