RuochenZhao/Awesome-explainable-AI

A collection of research materials on explainable AI/ML

MIT

Awesome-explainable-AI

This repository collects frontier research on explainable AI (XAI), which has recently become a hot topic. The figure below shows the trend in interpretable/explainable AI: publications on this topic are booming.

The figure below illustrates several use cases of XAI. We also divide the publications into several categories based on this figure. Organising these papers well is challenging, so feedback is welcome!

Survey Papers

From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI, ArXiv preprint 2022. Corresponding website with collection of XAI methods

Interpretable machine learning: Fundamental principles and 10 grand challenges, Statistics Surveys 2022

Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing, NeurIPS 2021

Pitfalls of Explainable ML: An Industry Perspective, Arxiv preprint 2021

Benchmarking and Survey of Explanation Methods for Black Box Models, Arxiv preprint 2021

Explainable Machine Learning in Deployment, FAT 2020

The elephant in the interpretability room: Why use attention as explanation when we have saliency methods, EMNLP Workshop 2020

A Survey of the State of Explainable AI for Natural Language Processing, AACL-IJCNLP 2020

Interpretable Machine Learning – A Brief History, State-of-the-Art and Challenges, Communications in Computer and Information Science 2020

A brief survey of visualization methods for deep learning models from the perspective of Explainable AI, Information Visualization 2020

Explaining Explanations in AI, ACM FAT 2019

Machine learning interpretability: A survey on methods and metrics, Electronics, 2019

A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI, IEEE TNNLS 2020

Interpretable machine learning: definitions, methods, and applications, Arxiv preprint 2019

Visual Analytics in Deep Learning: An Interrogative Survey for the Next Frontiers, IEEE Transactions on Visualization and Computer Graphics, 2019

Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI, Information Fusion, 2019

Explanation in artificial intelligence: Insights from the social sciences, Artificial Intelligence 2019

Evaluating Explanation Without Ground Truth in Interpretable Machine Learning, Arxiv preprint 2019

A survey of methods for explaining black box models, ACM Computing Surveys, 2018

Explaining Explanations: An Overview of Interpretability of Machine Learning, IEEE DSAA, 2018

Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI), IEEE Access, 2018

Explainable artificial intelligence: A survey, MIPRO, 2018

The Mythos of Model Interpretability: In machine learning, the concept of interpretability is both important and slippery, ACM Queue 2018

What Do You See? Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors, KDD 2021

How Convolutional Neural Networks See the World - A Survey of Convolutional Neural Network Visualization Methods, Mathematical Foundations of Computing 2018

Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models, Arxiv 2017

Towards A Rigorous Science of Interpretable Machine Learning, Arxiv preprint 2017

Explaining Explanation, Part 1: Theoretical Foundations, IEEE Intelligent Systems 2017

Explaining Explanation, Part 2: Empirical Foundations, IEEE Intelligent Systems 2017

Explaining Explanation, Part 3: The Causal Landscape, IEEE Intelligent Systems 2017

Explaining Explanation, Part 4: A Deep Dive on Deep Nets, IEEE Intelligent Systems 2017

An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data, Ecological Modelling 2004

Review and comparison of methods to study the contribution of variables in artificial neural network models, Ecological Modelling 2003

Books

Explainable Artificial Intelligence (xAI) Approaches and Deep Meta-Learning Models, Advances in Deep Learning Chapter 2020

Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, Springer 2019

Explanation in Artificial Intelligence: Insights from the Social Sciences, 2017 arxiv preprint

Visualizations of Deep Neural Networks in Computer Vision: A Survey, Springer Transparent Data Mining for Big and Small Data 2017

Explanatory Model Analysis: Explore, Explain, and Examine Predictive Models

Interpretable Machine Learning: A Guide for Making Black Box Models Explainable

Limitations of Interpretable Machine Learning Methods

An Introduction to Machine Learning Interpretability: An Applied Perspective on Fairness, Accountability, Transparency, and Explainable AI

Open Courses

Interpretability and Explainability in Machine Learning, Harvard University

Papers

We mainly follow the taxonomy in the survey paper and divide the XAI/XML papers into several branches.

Uncategorized Papers on Model/Instance Explanation

Constraint-Driven Explanations for Black Box ML Models, AAAI 2022

The Utility of Explainable AI in Ad Hoc Human-Machine Teaming, NeurIPS 2021

On Completeness-aware Concept-Based Explanations in Deep Neural Networks, NeurIPS 2020

Local Explanation of Dialogue Response Generation, NeurIPS 2021

Improving Deep Learning Interpretability by Saliency Guided Training, NeurIPS 2021

Explaining Hyperparameter Optimization via Partial Dependence Plots, NeurIPS 2021

Learning Groupwise Explanations for Black-Box Models, IJCAI 2021

On Explaining Random Forests with SAT, IJCAI 2021

What Changed? Interpretable Model Comparison, IJCAI 2021

Towards Probabilistic Sufficient Explanations, IJCAI 2021

On Explainability of Graph Neural Networks via Subgraph Explorations, ICML 2021

Why Attentions May Not Be Interpretable?, KDD 2021

Where and What? Examining Interpretable Disentangled Representations, CVPR 2021

Verifiability and Predictability: Interpreting Utilities of Network Architectures for Point Cloud Processing, CVPR 2021

Have We Learned to Explain?: How Interpretability Methods Can Learn to Encode Predictions in their Interpretations, AISTATS 2021

Does Explainable Artificial Intelligence Improve Human Decision-Making?, AAAI 2021

Incorporating Interpretable Output Constraints in Bayesian Neural Networks, NeurIPS 2020

Towards Interpretable Natural Language Understanding with Explanations as Latent Variables, NeurIPS 2020

Learning Deep Attribution Priors Based On Prior Knowledge, NeurIPS 2020

Understanding Global Feature Contributions through Additive Importance Measures, NeurIPS 2020

Learning identifiable and interpretable latent models of high-dimensional neural activity using pi-VAE, NeurIPS 2020

Generative causal explanations of black-box classifiers, NeurIPS 2020

Learning outside the Black-Box: The pursuit of interpretable models, NeurIPS 2020

Explaining Groups of Points in Low-Dimensional Representations, ICML 2020

Explaining Knowledge Distillation by Quantifying the Knowledge, CVPR 2020

Fanoos: Multi-Resolution, Multi-Strength, Interactive Explanations for Learned Systems, IJCAI 2020

Machine Learning Explainability for External Stakeholders, IJCAI 2020

Py-CIU: A Python Library for Explaining Machine Learning Predictions Using Contextual Importance and Utility, IJCAI 2020

Interpretable Models for Understanding Immersive Simulations, IJCAI 2020

Towards Automatic Concept-based Explanations, NeurIPS 2019

Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead, Nature Machine Intelligence 2019

InterpretML: A unified framework for machine learning interpretability, arXiv preprint 2019

All Models are Wrong, but Many are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously, JMLR 2019

On the Robustness of Interpretability Methods, ICML 2018 workshop

Towards A Rigorous Science of Interpretable Machine Learning, Arxiv preprint 2017

Object Region Mining With Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach, CVPR 2017

LOCO, Distribution-Free Predictive Inference For Regression, Arxiv preprint 2016

Explaining data-driven document classifications, MIS Quarterly 2014

Evaluation methods

When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data, ACL 2022

From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI, ArXiv preprint 2022. Corresponding website with collection of XAI methods

What Do You See? Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors, KDD 2021

Evaluations and Methods for Explanation through Robustness Analysis, arxiv preprint 2020

Evaluating and Aggregating Feature-based Model Explanations, IJCAI 2020

Sanity Checks for Saliency Metrics, AAAI 2020

A benchmark for interpretability methods in deep neural networks, NeurIPS 2019

Methods for interpreting and understanding deep neural networks, Digital Signal Processing 2017

Evaluating the visualization of what a Deep Neural Network has learned, IEEE Transactions on Neural Networks and Learning Systems 2015

Python Libraries (sorted in alphabetical order)

AIF360: https://github.com/Trusted-AI/AIF360

AIX360: https://github.com/IBM/AIX360

Anchor: https://github.com/marcotcr/anchor, scikit-learn

Alibi: https://github.com/SeldonIO/alibi

Alibi-detect: https://github.com/SeldonIO/alibi-detect

BlackBoxAuditing: https://github.com/algofairness/BlackBoxAuditing, scikit-learn

Brain2020: https://github.com/vkola-lab/brain2020, PyTorch, 3D brain MRI

Boruta-Shap: https://github.com/Ekeany/Boruta-Shap, scikit-learn

casme: https://github.com/kondiz/casme, PyTorch

Captum: https://github.com/pytorch/captum, PyTorch

cnn-exposed: https://github.com/idealo/cnn-exposed, TensorFlow

ClusterShapley: https://github.com/wilsonjr/ClusterShapley, scikit-learn

DALEX: https://github.com/ModelOriented/DALEX

Deeplift: https://github.com/kundajelab/deeplift, TensorFlow, Keras

DeepExplain: https://github.com/marcoancona/DeepExplain, TensorFlow, Keras

Deep Visualization Toolbox: https://github.com/yosinski/deep-visualization-toolbox, Caffe

Eli5: https://github.com/TeamHG-Memex/eli5, scikit-learn, Keras, XGBoost, LightGBM, CatBoost, etc.

explainx: https://github.com/explainX/explainx, XGBoost, CatBoost

ExplainaBoard: https://github.com/neulab/ExplainaBoard

ExKMC: https://github.com/navefr/ExKMC, Python

Facet: https://github.com/BCG-Gamma/facet, scikit-learn

Grad-cam-Tensorflow: https://github.com/insikk/Grad-CAM-tensorflow, TensorFlow

GRACE: https://github.com/lethaiq/GRACE_KDD20, PyTorch

Innvestigate: https://github.com/albermax/innvestigate, TensorFlow, Theano, CNTK, Keras

imodels: https://github.com/csinva/imodels

InterpretML: https://github.com/interpretml/interpret

interpret-community: https://github.com/interpretml/interpret-community

Integrated-Gradients: https://github.com/ankurtaly/Integrated-Gradients, TensorFlow

Keras-grad-cam: https://github.com/jacobgil/keras-grad-cam, Keras

Keras-vis: https://github.com/raghakot/keras-vis, Keras

keract: https://github.com/philipperemy/keract, Keras

Lucid: https://github.com/tensorflow/lucid, TensorFlow

LIT: https://github.com/PAIR-code/lit, TensorFlow, designed for NLP tasks

Lime: https://github.com/marcotcr/lime, nearly all Python platforms

LOFO: https://github.com/aerdem4/lofo-importance, scikit-learn

modelStudio: https://github.com/ModelOriented/modelStudio, Keras, TensorFlow, XGBoost, LightGBM, H2O

M3d-Cam: https://github.com/MECLabTUDA/M3d-Cam, PyTorch

NeuroX: https://github.com/fdalvi/NeuroX, PyTorch

neural-backed-decision-trees: https://github.com/alvinwan/neural-backed-decision-trees, PyTorch

Outliertree: https://github.com/david-cortes/outliertree, Python, R, C++

polyjuice: https://github.com/tongshuangwu/polyjuice, PyTorch

pytorch-cnn-visualizations: https://github.com/utkuozbulak/pytorch-cnn-visualizations, PyTorch

Pytorch-grad-cam: https://github.com/jacobgil/pytorch-grad-cam, PyTorch

PDPbox: https://github.com/SauceCat/PDPbox, scikit-learn

py-ciu: https://github.com/TimKam/py-ciu/

PyCEbox: https://github.com/AustinRochford/PyCEbox

path_explain: https://github.com/suinleelab/path_explain, TensorFlow

rulefit: https://github.com/christophM/rulefit

rulematrix: https://github.com/rulematrix/rule-matrix-py

Saliency: https://github.com/PAIR-code/saliency, TensorFlow

SHAP: https://github.com/slundberg/shap, nearly all Python platforms

Shapley: https://github.com/benedekrozemberczki/shapley

Skater: https://github.com/oracle/Skater

TCAV: https://github.com/tensorflow/tcav, TensorFlow, scikit-learn

skope-rules: https://github.com/scikit-learn-contrib/skope-rules, scikit-learn

TensorWatch: https://github.com/microsoft/tensorwatch.git, TensorFlow

tf-explain: https://github.com/sicara/tf-explain, TensorFlow

Treeinterpreter: https://github.com/andosa/treeinterpreter, scikit-learn

torch-cam: https://github.com/frgfm/torch-cam, PyTorch

WeightWatcher: https://github.com/CalculatedContent/WeightWatcher, Keras, PyTorch

What-if-tool: https://github.com/PAIR-code/what-if-tool, TensorFlow

XAI: https://github.com/EthicalML/xai, scikit-learn

Related Repositories

https://github.com/jphall663/awesome-machine-learning-interpretability

https://github.com/lopusz/awesome-interpretable-machine-learning

https://github.com/pbiecek/xai_resources

https://github.com/h2oai/mli-resources

https://github.com/AstraZeneca/awesome-explainable-graph-reasoning

https://github.com/utwente-dmb/xai-papers

Acknowledgements

We need your help to reorganize and refine the current taxonomy. Thank you very much!

I would greatly appreciate it if you could add more work related to XAI/XML to this repo, sort the uncategorized papers into categories, or contribute anything else that enriches it.

If you have any questions, feel free to contact me (yongjie.wang@ntu.edu.sg) or join the discussion on Gitter Chat.
