All-about-XAI

This repository is all about papers and tools of Explainable AI

Contents

Surveys

Towards better analysis of machine learning models: A visual analytics perspective. Liu, S., Wang, X., Liu, M., & Zhu, J. (2017). Visual Informatics, 1, 48-56.

Visual Interpretability for Deep Learning: a Survey. Quanshi Zhang, Song-Chun Zhu (2018). CVPR.

interpretable/disentangled middle-layer representations

Towards a rigorous science of interpretable machine learning. F. Doshi-Velez and B. Kim. (2018).

Trends and trajectories for explainable, accountable and intelligible systems: An HCI research agenda. A. Abdul, J. Vermeulen, D. Wang, B. Y. Lim, and M. Kankanhalli, in Proc. SIGCHI Conf. Hum. Factors Comput. Syst. (CHI), 2018, p. 582.

most focus on HCI research

A survey of methods for explaining black box models. R. Guidotti, A. Monreale, F. Turini, D. Pedreschi, and F. Giannotti.(2018).

presented a detailed taxonomy of explainability methods according to the type of problem faced.

Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). A. Adadi and M. Berrada, in IEEE Access, vol. 6, pp. 52138-52160, 2018.

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI. Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez. arXiv (2019).

How convolutional neural network see the world - A survey of convolutional neural network visualization methods. Qin, Z., Yu, F., Liu, C., & Chen, X. (2018). ArXiv, abs/1804.11191.

Visualization Systems/Tools

XAI Method

Transparent Model

As long as the model is accurate for the task and uses a reasonably restricted number of internal components, intrinsically interpretable models are sufficient; otherwise, post-hoc methods are needed. A minimal decision-tree sketch follows the list below.

Decision Trees

General Additive Models

Bayesian Models
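A minimal sketch of an intrinsically interpretable model, assuming scikit-learn: a shallow decision tree whose learned rules can be printed directly as if/else paths (the dataset and depth are arbitrary illustrative choices).

```python
# Minimal sketch: a shallow decision tree is interpretable by inspection,
# because its learned decision rules can be printed as explicit if/else paths.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
feature_names = load_iris().feature_names

# Restricting depth keeps the number of internal components small,
# which is what makes the model transparent in the first place.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# export_text renders the whole model as human-readable rules.
print(export_text(tree, feature_names=feature_names))
```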

Post-Hoc Explainability

Including natural language explanations, visualizations of learned models, and explanations by example.

Model Agnostic

Visualization Method

1. Saliency

Interpretable explanations of black boxes by meaningful perturbation, R. C. Fong, A. Vedaldi, in IEEE International Conference on Computer Vision, 2017, pp. 3429–3437.

Real time image saliency for black box classifiers, P. Dabkowski, Y. Gal, in: Advances in Neural Information Processing Systems, 2017, pp. 6967–6976.
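A simplified occlusion-style sketch of perturbation-based saliency, in the spirit of the papers above but not their exact methods; `model` and `image` are assumed to be a PyTorch classifier and a (1, 3, H, W) input tensor.

```python
# Simplified perturbation-based saliency sketch (occlusion style), assuming a
# PyTorch classifier `model` and an input tensor `image` of shape (1, 3, H, W).
# This illustrates the general idea, not the exact method of the papers above.
import torch

def occlusion_saliency(model, image, target_class, patch=16, stride=8):
    model.eval()
    with torch.no_grad():
        base_score = torch.softmax(model(image), dim=1)[0, target_class].item()
    _, _, H, W = image.shape
    saliency = torch.zeros(H, W)
    for top in range(0, H - patch + 1, stride):
        for left in range(0, W - patch + 1, stride):
            occluded = image.clone()
            # Zero out a patch (a crude baseline for "removing" that region).
            occluded[:, :, top:top + patch, left:left + patch] = 0.0
            with torch.no_grad():
                score = torch.softmax(model(occluded), dim=1)[0, target_class].item()
            # The larger the score drop, the more evidence that region carried.
            saliency[top:top + patch, left:left + patch] += base_score - score
    return saliency
```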

2. Sensitivity

Sensitivity refers to how an ANN's output is influenced by perturbations of its inputs and/or weights.

Opening black box data mining models using sensitivity analysis, P. Cortez and M. J. Embrechts, in Proc. IEEE Symp. Comput. Intell. Data Mining (CIDM), (2011)

Using sensitivity analysis and visualization techniques to open black box data mining models, P. Cortez and M. J. Embrechts, Inf. Sci. (2013).
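A generic one-at-a-time sensitivity sketch: perturb each input feature slightly and measure how much the model output moves. `predict` is an assumed callable returning a scalar score; this is not the exact SA procedures of the papers above.

```python
# Generic one-at-a-time input sensitivity sketch: perturb each feature slightly
# and measure how much the model's prediction moves. `predict` is any callable
# returning a scalar score for a feature vector (an assumption for illustration).
import numpy as np

def input_sensitivity(predict, x, eps=1e-2):
    x = np.asarray(x, dtype=float)
    base = predict(x)
    sens = np.zeros_like(x)
    for i in range(len(x)):
        perturbed = x.copy()
        perturbed[i] += eps
        # Finite-difference estimate of how the output responds to feature i.
        sens[i] = (predict(perturbed) - base) / eps
    return sens
```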

3. SHAP: SHapley Additive exPlanations

Assigns an importance value to each feature of a given prediction, based on the game-theoretic concept of Shapley values.

A unified approach to interpreting model predictions, S.M. Lundberg and S.I. Lee, in Proc. Adv. Neural Inf. Process. Syst., 2017.
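A minimal sketch using the `shap` Python package with a tree ensemble; xgboost and the dataset are illustrative choices, and the API may differ slightly across shap versions.

```python
# Minimal SHAP sketch using the `shap` package with a tree ensemble.
# xgboost and the dataset are arbitrary illustrative choices.
import shap
import xgboost
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = xgboost.XGBClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)    # efficient Shapley values for tree models
shap_values = explainer.shap_values(X)   # one importance value per feature per prediction

# Summary plot of feature importances across the whole dataset.
shap.summary_plot(shap_values, X)
```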

4. Partial Dependence Plot (PDP)
  • PDP

Auditing black-box models for indirect influence, P. Adler, C. Falk, S. A. Friedler, T. Nix, G. Rybeck, C. Scheidegger, B. Smith, S. Venkatasubramanian, Knowledge and Information Systems (2018)

  • ICE: Individual Conditional Expectation (extends PDP)

Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation, A. Goldstein, A. Kapelner, J. Bleich, E. Pitkin, Journal of Computational and Graphical Statistics 24 (1) (2015) 44–65.

ICE plots extend PDPs: by disaggregating the averaged PDP output, they reveal interactions and individual differences (see the sketch after this list).

  • PI & ICI

Visualizing the feature importance for black box models, G. Casalicchio, C. Molnar, B. Bischl, Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, 2018, pp. 655–670.
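Both PDP and ICE are available in scikit-learn's inspection module; a hedged sketch (API of recent scikit-learn versions, model and dataset chosen arbitrarily):

```python
# PDP and ICE sketch with scikit-learn's inspection utilities
# (available in recent scikit-learn versions; older releases differ).
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor().fit(X, y)

# kind="both" overlays the individual ICE curves on the averaged PDP,
# which is exactly the disaggregation described above.
PartialDependenceDisplay.from_estimator(model, X, features=["bmi", "s5"], kind="both")
plt.show()
```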

5. Surrogate Models

LIME

Interpretability via model extraction. O. Bastani, C. Kim, and H. Bastani. (2017).

TreeView: Peeking into deep neural networks via feature-space partitioning. J. J. Thiagarajan, B. Kailkhura, P. Sattigeri, and K. N. Ramamurthy.(2016)
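A hedged sketch of a global surrogate: fit an interpretable decision tree to a black box's own predictions. LIME instead fits local surrogates around individual instances, but the core idea (approximate, then inspect) is the same; models and dataset here are illustrative choices.

```python
# Global surrogate sketch: approximate a black-box model with a shallow,
# interpretable decision tree trained on the black box's own predictions.
# (LIME instead fits *local* surrogates around individual instances.)
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Train the surrogate on the black box's labels, not the ground truth.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how well the surrogate mimics the black box on the same data.
print("surrogate fidelity:", surrogate.score(X, black_box.predict(X)))
print(export_text(surrogate, feature_names=list(load_breast_cancer().feature_names)))
```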

6. Loss Function Vis

Visualizing the Loss Landscape of Neural Nets. Li, H., Xu, Z., Taylor, G., & Goldstein, T. (2017). NeurIPS.
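A minimal 1-D sketch of the idea: evaluate the loss along a random direction in parameter space around the trained weights. Li et al. additionally use filter-normalized directions and 2-D contour plots; `model`, `loss_fn`, `data`, and `target` are assumed to be a PyTorch model and a batch of its training data.

```python
# 1-D loss-landscape sketch: evaluate the loss along a random direction in
# parameter space around the trained weights. This only shows the core idea;
# the paper above adds filter-wise normalization and 2-D slices.
import torch

def loss_along_direction(model, loss_fn, data, target, alphas):
    # Snapshot the trained parameters and draw one random direction per tensor.
    theta = [p.detach().clone() for p in model.parameters()]
    direction = [torch.randn_like(p) for p in theta]
    losses = []
    for alpha in alphas:
        with torch.no_grad():
            for p, t, d in zip(model.parameters(), theta, direction):
                p.copy_(t + alpha * d)  # move the weights along the direction
            losses.append(loss_fn(model(data), target).item())
    # Restore the original weights before returning.
    with torch.no_grad():
        for p, t in zip(model.parameters(), theta):
            p.copy_(t)
    return losses
```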

Model Distillation

Adversarial Attack

Adversarial examples: Attacks and defenses for deep learning. X. Yuan, P. He, Q. Zhu, and X. Li. (2017).

Feature Relevance/Importance Method

Local interpretability

Global interpretability means understanding the whole logic of a model and following the entire reasoning that leads to all the possible outcomes.

Local interpretability means explaining the reasons for a specific decision or a single prediction.

Model Specific

Tree-based Model

CNN

1. Visualization

(1) Max Activation

Synthesizes an input pattern that causes maximal activation of a specific neuron.
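A hedged activation-maximization sketch in PyTorch: gradient ascent on a random input to maximize a chosen output unit. The regularizers real methods use to obtain natural-looking patterns (blurring, jitter, image priors) are omitted.

```python
# Activation-maximization sketch: gradient ascent on the input to maximize a
# chosen output unit of a PyTorch model. Regularizers used in practice are
# omitted for brevity; only the input is optimized, the model stays fixed.
import torch

def maximize_activation(model, unit, steps=200, lr=0.1, input_shape=(1, 3, 224, 224)):
    model.eval()
    x = torch.randn(input_shape, requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        activation = model(x)[0, unit]  # activation of the chosen unit
        (-activation).backward()        # ascend by descending the negative
        optimizer.step()
    return x.detach()
```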

(2) Deconvolution (2010)

Finds the patterns in the input image that activate a specific neuron in the convolutional layers, by projecting that neuron's low-dimensional feature maps back to the image space.

  1. First propose Deconv

    Deconvolutional networks. M. D. Zeiler, D. Krishnan, G. W. Taylor, R. Fergus, in: CVPR, Vol. 10, 2010, p. 7.

  2. Use Deconv to visualize CNN

    Visualizing and understanding convolutional networks. Matthew D. Zeiler and Rob Fergus. In ECCV, 2014.

(3) Inversion

Unlike the methods above, which visualize the CNN from a single neuron's activation, this method works at the layer level.

Reconstructs an input image from a specific layer's feature maps, revealing what image information is preserved in that layer.
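A hedged feature-inversion sketch: optimize a random image so its feature maps at a chosen layer match those of a reference image. `feature_extractor` is an assumed callable mapping an image tensor to that layer's activations; the natural-image regularizers used in practice are omitted.

```python
# Feature-inversion sketch: optimize an image so that its feature maps at a chosen
# layer match those of a reference image, revealing what that layer preserves.
# `feature_extractor` maps an image tensor to that layer's activations (assumed).
import torch

def invert_features(feature_extractor, reference_image, steps=300, lr=0.05):
    with torch.no_grad():
        target = feature_extractor(reference_image)
    x = torch.randn_like(reference_image, requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        # Match the reconstruction's features to the reference features.
        loss = torch.nn.functional.mse_loss(feature_extractor(x), target)
        loss.backward()
        optimizer.step()
    return x.detach()
```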

(4) Visualization System: Understanding, Diagnosis, Refinement

2. Using Explainable Models

3. Architecture Modification

RNN

Generative Model

Reinforcement Learning