Attributing predictions made by the Inception network using the Integrated Gradients method

Integrated Gradients

(a.k.a. Path-Integrated Gradients, a.k.a. Axiomatic Attribution for Deep Networks)

Contact: integrated-gradients AT gmail.com

Contributors (alphabetical, last name):

  • Kedar Dhamdhere (Google)
  • Pramod Kaushik Mudrakarta (U. Chicago)
  • Mukund Sundararajan (Google)
  • Ankur Taly (Google Brain)
  • Jinhua (Shawn) Xu (Verily)

We study the problem of attributing the prediction of a deep network to its input features, as an attempt towards explaining individual predictions. For instance, in an object recognition network, an attribution method could tell us which pixels of the image were responsible for a certain label being picked; in a sentiment analysis network, it could tell us which words in a sentence were indicative of strong sentiment.

Applications range from helping a developer debug a network, to allowing analysts to explore its logic, to giving end users some transparency into the reason for a network's prediction.

Integrated Gradients is a variation on computing the gradient of the prediction output w.r.t. features of the input. It requires no modification to the original network, is simple to implement, and is applicable to a variety of deep models (sparse and dense, text and vision).
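
For reference, the ICML 2017 paper listed under "Relevant papers and slide decks" defines the attribution for feature i as the gradient of the network output accumulated along the straight-line path from a baseline input to the actual input. The symbols here follow that paper and are not used elsewhere in this README: F is the network's prediction function, x is the input, x' is the baseline (for example, an all-black image), and α is the interpolation parameter.

```latex
\mathrm{IntegratedGrads}_i(x) \;=\; (x_i - x'_i) \int_{0}^{1}
  \frac{\partial F\bigl(x' + \alpha\,(x - x')\bigr)}{\partial x_i} \, d\alpha
```

In practice the integral is approximated by averaging gradients at a finite number of points along the path.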

Relevant papers and slide decks

  • Axiomatic Attribution for Deep Networks -- Mukund Sundararajan, Ankur Taly, Qiqi Yan, Proceedings of the International Conference on Machine Learning (ICML), 2017

    This paper introduced the Integrated Gradients method. It presents an axiomatic justification of the method along with applications to various deep networks. Slide deck

  • Did the model understand the questions? -- Pramod Mudrakarta, Ankur Taly, Mukund Sundararajan, Kedar Dhamdhere, Proceedings of the Association for Computational Linguistics (ACL), 2018

    This paper discusses an application of integrated gradients for evaluating the robustness of question-answering networks. Slide deck

Implementing Integrated Gradients

This How-To document describes the steps involved in implementing integrated gradients for an arbitrary deep network.
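
As a rough illustration of what such an implementation involves, the sketch below approximates integrated gradients with a midpoint Riemann sum over the straight-line path from a baseline to the input, following the description in the ICML 2017 paper. It is not the code from this repository: `integrated_gradients`, `toy_model`, `toy_model_grad`, `WEIGHTS`, and the `steps` parameter are hypothetical names chosen for this example, and the "network" is a three-feature logistic function with a hand-written gradient standing in for a real model's gradient op.

```python
import math


def integrated_gradients(x, baseline, grad_fn, steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum.

    x, baseline: equal-length lists of floats.
    grad_fn: hypothetical callable returning the gradient of the model
             output w.r.t. its input at a given point, as a list.
    """
    n = len(x)
    grad_sum = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grads = grad_fn(point)
        for i in range(n):
            grad_sum[i] += grads[i]
    # Average gradient along the path, scaled by (input - baseline).
    return [(xi - b) * g / steps for xi, b, g in zip(x, baseline, grad_sum)]


# Toy stand-in for a network (an assumption of this sketch): F(z) = sigmoid(w . z)
WEIGHTS = [2.0, -1.0, 0.5]


def toy_model(z):
    s = sum(w * zi for w, zi in zip(WEIGHTS, z))
    return 1.0 / (1.0 + math.exp(-s))


def toy_model_grad(z):
    p = toy_model(z)
    return [p * (1.0 - p) * w for w in WEIGHTS]


x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, toy_model_grad, steps=50)

# Sanity check: attributions should sum to F(x) - F(baseline), up to
# the error of the numerical approximation.
completeness_gap = abs(sum(attributions) - (toy_model(x) - toy_model(baseline)))
```

The sketch uses plain Python lists so that it runs without dependencies. For a real network, the interpolated inputs would normally be evaluated as one batch through the framework's gradient computation. The final check is a convenient way to choose the number of steps: if the gap is large, increase `steps`.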

This repository provides code for implementing integrated gradients for networks with image inputs. It is structured as follows:

We recommend starting with the notebook. To run the notebook, follow the steps below.

  • Clone this repository

    git clone https://github.com/ankurtaly/Attributions
    
  • In the same directory, run the Jupyter notebook server.

    jupyter notebook
    

    Instructions for installing Jupyter are available here. Please make sure that you have TensorFlow, NumPy, and PIL.Image installed for Python 2.7.

  • Open attributions.ipynb and run all cells.
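
Before running the notebook, it may help to confirm that the required packages can be imported. The following is a small convenience sketch, not part of this repository; `missing_modules` and `REQUIRED` are hypothetical names. It uses only `importlib.import_module`, which is available in Python 2.7 and later.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Import names for the dependencies listed above.
REQUIRED = ["tensorflow", "numpy", "PIL.Image"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing modules: " + (", ".join(missing) if missing else "none"))
```

Run it with the same Python interpreter that the Jupyter kernel uses, since packages installed for a different interpreter will not be visible to the notebook.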