Visualisation toolkit implemented in PyTorch for inspecting what neural networks learn in image recognition tasks (feature visualisation).
The project is very much a work in progress, and I would appreciate your feedback!
It currently supports visualisation of saliency maps for all the models available under torchvision.models.
Misa Ogura. (2019, July 8). MisaOgura/flashtorch: 0.0.8 (Version v0.0.8). Zenodo. http://doi.org/10.5281/zenodo.3271410
$ pip install flashtorch
Notebook: Image handling
Notebook: Image-specific class saliency map with backpropagation
- Notebook also available on Google Colab - probably the best way to play around quickly, as there is no need to set up the environment!
Saliency in human visual perception is a subjective quality that makes certain things within the field of view stand out from the rest and grab our attention.
Saliency maps in computer vision indicate the most salient regions within images. By creating a saliency map for a neural network, we can gain some intuition on where the network is paying the most attention in an input image.
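At its core, a saliency map of this kind is the gradient of the target class score with respect to the input pixels. As a rough illustration, here is a minimal sketch in plain PyTorch (not FlashTorch's own implementation):

```python
import torch

def saliency_map(model, image, target_class):
    """Gradient of the class score w.r.t. input pixels, as an (H, W) map."""
    model.eval()
    image = image.detach().clone().requires_grad_(True)
    score = model(image)[0, target_class]  # scalar score for the target class
    score.backward()                       # populates image.grad
    # Take the absolute gradient and the max over colour channels
    return image.grad.abs().max(dim=1)[0].squeeze(0)
```

For a 1x3x224x224 input this yields a 224x224 map whose large values mark the pixels the class score is most sensitive to.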
Using the flashtorch.saliency module, let's visualise image-specific class saliency maps of AlexNet pre-trained on ImageNet classification tasks.
Great gray owl (class index 24): The network is focusing on the sunken eyes and the round head for this owl.
Peacock (class index 84): But it doesn't always focus on the eyes and head of an animal. In its view, what makes a peacock a peacock is the eyespots on its tail!
Toucan (class index 96): And in the case of a toucan, the network is paying intense attention to its beak.
Do you agree? 🤖
In the example above, we've visualised saliency maps for a network that has been trained on ImageNet and used images of objects which it already knows.
We can take a step further and investigate how the network's perception changes before and after training, when presented with a new object.
This time, I'm going to use DenseNet, which is again pre-trained on ImageNet (1000 classes), and fine-tune it as a flower classifier to recognise 102 species of flowers (dataset).
With no additional training, and just by swapping out the last fully-connected layer, the model performs very poorly (0.1% test accuracy). By plotting the gradients, we can see that the network is mainly focusing on the shape of the flower.
Foxgloves as an example:
With training, the model now achieves 98.7% test accuracy. But why? What is it that it's seeing now, that it wasn't before?
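The fine-tuning itself can be sketched as a standard training loop where only the new head is updated (a minimal sketch; optimiser choice and learning rate are assumptions, not the author's exact recipe):

```python
import torch
import torch.nn as nn

def train_epoch(model, loader, lr=1e-3):
    """One fine-tuning epoch: only parameters with requires_grad are updated."""
    criterion = nn.CrossEntropyLoss()
    optimiser = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=lr)
    model.train()
    for images, labels in loader:
        optimiser.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimiser.step()
```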
The network has learnt to shift its focus to the mottled pattern within the flower cups! In its view, that is the most distinguishing feature of this object, which I think closely aligns with what we deem the most unique trait of this flower.
- Hopperx1 London, June 2019 - slide deck
- Introduction and overview of feature visualisation: Feature Visualization
- Latest developments in feature visualisation: Exploring Neural Networks with Activation Atlases
- Using backpropagation for gradient visualisation: Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
- Guided backpropagation: Striving for Simplicity: The All Convolutional Net
- pytorch-cnn-visualizations by utkuozbulak
- keras-vis by raghakot