Eval_XAI_Robustness

This is the repository for the paper "SAFARI: Versatile and Efficient Evaluations for Robustness of Interpretability". It proposes two metrics for evaluating the robustness of interpretations, from the worst-case and the probabilistic perspective, respectively. Popular XAI methods such as Integrated Gradients, LRP, and DeepLift are supported for evaluation.

Environment Setup

Requires a Linux platform with Python 3.8.5. We recommend using Anaconda to create the virtual environment. The requirements.txt file lists the Python packages required to run the code. Follow the steps below to install them:

  • Create virtual environment and install necessary packages

    conda create -n eval_xai --file requirements.txt

  • Activate virtual environment

    conda activate eval_xai

Files

  • model: directory containing the scripts for training the test models
  • checkpoints: directory containing saved checkpoints of the pre-trained test models
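
For reference, here is a minimal sketch of loading a saved checkpoint for evaluation, assuming the checkpoints are standard PyTorch state_dicts. The architecture and file name below are hypothetical stand-ins; the real ones live under model/ and checkpoints/.

    import torch
    import torch.nn as nn

    # Stand-in architecture; the actual test model is defined under model/.
    net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(),
                        nn.Linear(128, 10))

    # Hypothetical file name; check checkpoints/ for the actual one.
    net.load_state_dict(torch.load("checkpoints/mnist.pth", map_location="cpu"))
    net.eval()  # switch to evaluation mode before computing interpretations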

Note:

  • Due to the file size limit, we only include a pre-trained test model for the MNIST dataset. For other datasets, please train the test models first.

  • You may get the error 'zipfile.BadZipFile: File is not a zip file' when downloading the CelebA dataset: Google Drive enforces a daily download quota per file. In that case, manually download the dataset from here, unzip it, and move it to the folder Datasets/celeba (a loading sketch follows below).
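
After the manual download, the following minimal check verifies that the dataset can be read from the local folder. This is a sketch assuming CelebA is loaded through torchvision.datasets; download=False keeps torchvision from contacting Google Drive.

    from torchvision import datasets, transforms

    # Expects the unzipped files under Datasets/celeba (img_align_celeba/,
    # list_attr_celeba.txt, ...), which is the layout torchvision expects.
    celeba = datasets.CelebA(
        root="Datasets",  # torchvision looks for the celeba subfolder itself
        split="train",
        transform=transforms.ToTensor(),
        download=False,   # use the manually downloaded copy
    )
    print(len(celeba))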

How To Use

The tool can be used for XAI robustness evaluation and for training the test models, with the commands below.

Quick Start

You can quickly run the worst-case robustness evaluation on interpretations by Gradient x Input, using a Genetic Algorithm as the optimizer:

python main.py --eval_metric ga
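
To illustrate what the worst-case metric measures, here is a minimal, self-contained sketch rather than the repository's implementation: it computes Gradient x Input attributions for an input and a perturbed copy, then reports how far the interpretation moved. The tool itself searches the perturbation ball with a Genetic Algorithm instead of drawing a single random sample; the toy model and radius below are illustrative only.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    def gradient_x_input(model, x, target):
        # Gradient x Input attribution for one target class.
        x = x.clone().requires_grad_(True)
        model(x)[0, target].backward()
        return (x.grad * x).detach()

    # Toy stand-in model and input; the repo evaluates its trained test models.
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.rand(1, 1, 28, 28)
    target = model(x).argmax(dim=1).item()

    # One random perturbation inside an L_inf ball of radius 0.05; the GA
    # searches this ball for the perturbation that distorts the interpretation most.
    delta = (torch.rand_like(x) * 2 - 1) * 0.05
    discrepancy = (gradient_x_input(model, x, target)
                   - gradient_x_input(model, x + delta, target)).norm()
    print("interpretation discrepancy:", discrepancy.item())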

Alternatively, run the probabilistic robustness evaluation on interpretations by Gradient x Input, using Subset Simulation as the sampling method:

python main.py --eval_metric ss
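
For intuition about the probabilistic metric, the sketch below estimates, by plain Monte Carlo over random perturbations, how often the interpretation deviates beyond a threshold. Subset Simulation, as used by the tool, estimates this kind of probability far more efficiently when violations are rare; the model, threshold, and noise level here are all illustrative stand-ins.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    def gradient_x_input(model, x, target):
        # Gradient x Input attribution for one target class.
        x = x.clone().requires_grad_(True)
        model(x)[0, target].backward()
        return (x.grad * x).detach()

    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.rand(1, 1, 28, 28)
    target = model(x).argmax(dim=1).item()
    attr = gradient_x_input(model, x, target)

    # Crude Monte Carlo estimate of P(interpretation deviates beyond a threshold)
    # under uniform noise; Subset Simulation reaches rare events far more cheaply.
    eps, threshold, n_samples = 0.05, 0.5, 500
    hits = sum(
        ((attr - gradient_x_input(model, x + (torch.rand_like(x) * 2 - 1) * eps,
                                  target)).norm() > threshold).item()
        for _ in range(n_samples)
    )
    print("estimated violation probability:", hits / n_samples)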