This repository provides an implementation of the Integrated Gradients method on the Inception (v1) object recognition network.
Integrated Gradients is a method for attributing a deep network's prediction to its input features. It was proposed in this paper, published at ICML 2017. In a nutshell, the idea is to examine the gradients of inputs obtained by interpolating on a straight-line path between the input at hand and a baseline input, and then aggregate these gradients together. The resulting values form an attribution of the prediction to the input features. In an object recognition network, such attributions could tell us which pixels of the image were responsible for a certain label being picked.
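As a concrete illustration, here is a minimal NumPy sketch of that computation. It assumes a hypothetical `grad_fn(x)` that returns the gradient of the target class score with respect to the input `x`; the names and signature are illustrative and not part of this repository's code.

```python
# Minimal sketch of integrated gradients (illustrative only; `grad_fn` is a
# hypothetical callable returning d(score)/d(input) for the target class).
import numpy as np

def integrated_gradients(inp, baseline, grad_fn, steps=50):
    # Interpolate on a straight-line path from the baseline to the input.
    alphas = np.linspace(0.0, 1.0, steps + 1)
    scaled_inputs = [baseline + a * (inp - baseline) for a in alphas]

    # Aggregate the gradients along the path (a Riemann-sum approximation
    # of the underlying path integral).
    grads = np.array([grad_fn(x) for x in scaled_inputs])
    avg_grads = grads[1:].mean(axis=0)

    # Scale by the input-baseline difference to get per-feature attributions.
    return (inp - baseline) * avg_grads
```

Summing the returned attributions approximates the difference between the network's score at the input and at the baseline, which is the completeness property discussed in the paper.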
The method is widely applicable, requires no modification to the original network, and is extremely simple to implement. Additionally, the method is backed by an axiomatic justification and has some nice mathematical properties. We recommend reading the paper for more details.
This repository provides an implementation of integrated gradients along with methods for visualizing them.
The code for generating and visualizing integrated gradients is in a single Jupyter notebook, `attributions.ipynb`.
To run the notebook, please follow the steps below.
- Clone this repository:

  ```
  git clone https://github.com/ankurtaly/Attributions
  ```

- In the same directory, run the Jupyter notebook server:

  ```
  jupyter notebook
  ```

  Instructions for installing Jupyter are available here. Please make sure that you have TensorFlow, NumPy, and PIL.Image installed for Python 2.7 (a quick import check is sketched after this list).

- Open `attributions.ipynb` and run all cells.
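As a quick sanity check of the environment, the following snippet (not part of the notebook) simply verifies that the dependencies listed above can be imported:

```python
# Quick check that the required packages are importable under Python 2.7.
import tensorflow as tf
import numpy as np
from PIL import Image  # PIL.Image, typically provided by the Pillow package

print("TensorFlow " + tf.__version__)
print("NumPy " + np.__version__)
```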
Below are some visualizations of interior gradients (as a GIF) and integrated gradients for some images from the ImageNet object recognition dataset. For comparison, we also show a visualization of the gradients at the actual image.