
WhiteBox - Part1

The White Box Project introduces a variety of methods for opening up the black box of machine learning. In this part, I introduce and experiment with ways to interpret and evaluate models in the image domain.

For each reference, I have also shared a Korean translation, made while studying the methodology (and practicing English). Please see the Reference section.

Keywords : Explainable AI, XAI, Interpretable AI, Attribution Method, Saliency Map

Requirements

pytorch >= 1.2.0
torchvision == 0.4.0

How to Run

Bracketed lists in the commands below show the available choices; pass a single value, e.g. --target=mnist.

Model Train

python main.py --train --target=['mnist','cifar10']

Model Selectivity Evaluation

python main.py --eval=selectivity --target=['mnist','cifar10'] --method=['VGB','IB','DeconvNet','IG','GB','GC','GBGC']

Model ROAR & KAR Evaluation
For ROAR and KAR, the saliency maps of each attribution method you want to evaluate must be saved before running the evaluation.

python main.py --eval=['ROAR','KAR'] --target=['mnist','cifar10'] --method=['VGB','IB','DeconvNet','IG','GB','GC','GBGC']

Dataset

  • MNIST
  • CIFAR-10

Saliency Maps

Attribution Methods

The implemented attribution methods are those accepted by the --method flag above: VGB, IB, DeconvNet, IG, GB, GC, and GBGC.

Ensemble Methods

  • SmoothGrad (SG) [5]
  • SmoothGrad-Squared (SG-SQ) [6]
  • SmoothGrad-VAR (SG-VAR) [6]
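
A minimal sketch of the SmoothGrad family, assuming a classifier model and an input tensor x of shape (1, C, H, W); the function name, noise level, and sample count below are illustrative defaults, not the repository's settings:

    import torch

    def smoothgrad(model, x, target, n_samples=25, sigma=0.15):
        # Average vanilla gradients over noisy copies of the input (SG [5]).
        model.eval()
        grads = []
        for _ in range(n_samples):
            noisy = (x + sigma * torch.randn_like(x)).requires_grad_(True)
            score = model(noisy)[0, target]               # target-class logit
            grad, = torch.autograd.grad(score, noisy)
            grads.append(grad.detach())
        g = torch.stack(grads)
        # SG averages the gradients; SG-SQ averages their squares, and
        # SG-VAR uses the variance across the noisy samples [6].
        return g.mean(0), (g ** 2).mean(0), g.var(0)

These ensembles can wrap any base attribution method, not only vanilla gradients.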

Evaluation Methods

  • Coherence
  • Selectivity
  • Remove and Retrain (ROAR) [6]
  • Keep and Retrain (KAR) [6]

Experiments

Model Architecture & Performance

[Notebook]

[Image: simple CNN architecture used for both MNIST and CIFAR-10, with per-dataset performance]

Evaluation Results

Coherence

[Notebook]

Coherence is a qualitative evaluation method that shows where an attribution method assigns importance in an image. Attributions should fall on discriminative features (e.g. the object of interest).
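
As a minimal sketch of this kind of qualitative check, assuming a classifier model and an input batch x (the function name and the use of plain vanilla gradients are illustrative, not the repository's exact implementation):

    import torch

    def vanilla_saliency(model, x, target):
        # Pixels with large |d(score)/d(input)| should land on the object
        # of interest if the attribution method is coherent.
        model.eval()
        x = x.clone().requires_grad_(True)
        score = model(x)[0, target]
        score.backward()
        return x.grad.abs().max(dim=1)[0]   # max over channels -> (N, H, W) heatmap

The resulting heatmap is then overlaid on the input image for visual inspection.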

MNIST

mnist_coherence

CIFAR-10

cifar10_coherence

Selectivity

[Notebook]

Selectivity is a quantitative evaluation method for attribution methods. It alternates between two steps. First, a saliency map is computed for the image, and the most influential pixels are deleted from the image. Second, the saliency map is computed again for the modified image, and the first step is repeated. If the attribution method is accurate, the model's output score should drop quickly as pixels are deleted.
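
A minimal sketch of this loop, assuming an attribute(model, x, target) function that returns an importance map with the same shape as x; the function name, the zero replacement value, and the 5%-per-step deletion are assumptions for illustration:

    import torch

    def selectivity_curve(model, x, target, attribute, n_steps=20, frac=0.05):
        # Repeatedly delete the most influential pixels and record the score.
        scores = []
        x = x.clone()
        for _ in range(n_steps):
            sal = attribute(model, x, target)        # assumed same shape as x
            k = max(1, int(frac * sal.numel()))
            idx = sal.flatten().topk(k).indices      # most influential entries
            x = x.flatten().index_fill(0, idx, 0.0).view_as(x)  # delete them
            with torch.no_grad():
                scores.append(torch.softmax(model(x), dim=1)[0, target].item())
        return scores  # a faster drop indicates a better attribution method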

As a result, IB, GB, and GB-GC were the attribution methods that degraded model performance most quickly on both datasets.

[Images: selectivity score curves for MNIST and CIFAR-10]

ROAR/KAR

ROAR and KAR quantitatively evaluate attribution methods by measuring how classifier performance changes as input features are removed according to each method's attributions (see the sketch after the list below).

  • ROAR : replace the N% of pixels estimated to be most important [Notebook]
  • KAR : replace the N% of pixels estimated to be least important
  • Retrain the model and measure the change in test accuracy
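
A minimal sketch of the pixel-replacement step, assuming precomputed per-image saliency maps of shape (N, H, W); the helper name and the use of the per-image mean as the replacement value are assumptions for illustration:

    import torch

    def replace_pixels(images, saliency, fraction, mode="ROAR"):
        # ROAR replaces the most important pixels, KAR the least important.
        out = images.clone()
        n, c, h, w = out.shape
        k = int(fraction * h * w)
        flat_sal = saliency.view(n, -1)                    # (N, H*W) importance
        idx = flat_sal.topk(k, largest=(mode == "ROAR")).indices
        mean = images.view(n, -1).mean(dim=1)              # per-image mean value
        for i in range(n):
            rows, cols = idx[i] // w, idx[i] % w
            out[i, :, rows, cols] = mean[i]                # replace across channels
        return out

The model is then retrained on the modified images, and the change in test accuracy is compared across attribution methods.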

Reference