Empirical Explainers

This repo contains the refactored code and pointers to the data accompanying our paper "Efficient Explanations from Empirical Explainers".

Below, please find (1) pointers to the heatmaps promised in the paper, (2) instructions on how to replicate our experiments and (3) all our data, models and logs.

Disclaimer: This is a refactored and prettified version of the code and has not been tested exhaustively yet. This disclaimer will be removed when the tests have been conducted.

💾 Download the Explanations / Experiments

Here are the explanations and experiments.

Explanations

Explanations are contained in the folder explanations. Explanations are the attribution maps returned by the expensive target explainers and the Empirical Explainers. Each line contains an HTML document. Your browser probably will not be able to load all data at once. You can extract individual lines into separate files to solve that issue.
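Extracting individual lines can be sketched as follows. This is a minimal example, not part of the repo: the function name split_html_lines, the output file naming scheme and the start and count parameters are assumptions made for illustration.

```python
from pathlib import Path


def split_html_lines(source, out_dir, start=0, count=10):
    """Write lines start .. start + count - 1 of `source` to separate HTML files.

    Assumes (as stated above) that each line of `source` is one HTML document.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(source, encoding="utf-8") as handle:
        # Stream the file line by line so that it never has to fit in memory.
        for index, line in enumerate(handle):
            if index < start:
                continue
            if index >= start + count:
                break
            if not line.strip():
                continue  # skip blank lines
            target = out_dir / f"explanation_{index:06d}.html"
            target.write_text(line, encoding="utf-8")
            written.append(target)
    return written
```

For example, split_html_lines("<path-to-explanations-file>", "explanations/split", start=0, count=25) would write the first 25 heatmaps to individual files that a browser can open one at a time.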

Experiments

Our experiments (data, models, logs) can be downloaded from the folder experiments or replicated as described below. Please note that the archives in experiments range in size from 1 - 10 GB.

📊 Replicate our Experimental Results

You can either replicate our experimental results by following the steps below, or download our data, models and logs as described in the download section above. Our experiments are config-driven.

📁 Directories and File Structure

We assume a base directory for all experiments with 5 sub folders:

|- data: The raw data is downloaded here, using Huggingface's (HF) datasets.
|- models: Model parameters are saved here.
|- explanations: Saliency maps are saved here as json lines.
|- visualizations: We save attribution maps in html format here.
|- logs: Training, explanation and visualization logs are saved here.
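The layout above can be created up front with a few lines of Python. This is a sketch under the assumption that the jobs expect these five folders to exist; the helper name create_base_dir is hypothetical and the repo may create some of the folders itself.

```python
from pathlib import Path

# The five sub folders listed above.
SUB_FOLDERS = ("data", "models", "explanations", "visualizations", "logs")


def create_base_dir(base_dir):
    """Create the base directory and its sub folders; existing folders are kept."""
    base = Path(base_dir)
    for name in SUB_FOLDERS:
        (base / name).mkdir(parents=True, exist_ok=True)
    return [base / name for name in SUB_FOLDERS]
```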

📚 Install the dependencies

Install pytorch / torchvision
Install the requirements listed in requirements.txt
- A complete list of the exact requirements is contained in pip.freeze.txt

🏃 Running jobs

All jobs are coordinated using the flags and pointers in ./configs/run_job.jsonnet. To run a job, adapt the config and then run, e.g.

CUDA_VISIBLE_DEVICES=0 python run_job.py 2>&1 | tee -a <base-dir>/logs/job.log

💾 Download the Data

In run_job.jsonnet, please set the download flag to true and let path_config_download point to a download configuration file. Example files are provided in ./configs/snli/download/ and ./configs/imdb/download/. We annotated

./configs/imdb/download/download.imdb.validation.jsonnet

We save the training data to |-data/train/, the validation data to |-data/validation/ and the test data to |-data/test/.

💻 Train the downstream model

We train models using PyTorch Ignite in conjunction with HF's Transformers.

The training is again config-driven. We annotated the fields in

./configs/imdb/train/train.downstream.jsonnet.

After training, the model can be found in |-models.

🏭 Expensively explain the downstream model

We explain models using PyTorch's Captum. The explanation job is again config-driven; an annotated config is provided in

./configs/imdb/explain/explain.expensive.jsonnet

Explanations are written to json lines. The json file can be found in |-explanations/, after the job is done. At a minimum, an explanation contains the fields input_ids and attributions.
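Reading such a file back can be sketched as follows. Only the two field names input_ids and attributions come from the description above; the function name read_explanations and the error handling are assumptions for illustration.

```python
import json

# Minimum fields an explanation is said to contain.
REQUIRED_FIELDS = ("input_ids", "attributions")


def read_explanations(path):
    """Yield one explanation dict per non-empty line of a json lines file."""
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            explanation = json.loads(line)
            missing = [f for f in REQUIRED_FIELDS if f not in explanation]
            if missing:
                raise KeyError(f"line {line_number}: missing fields {missing}")
            yield explanation
```

Because the function is a generator, large explanation files can be processed one instance at a time.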

💻 Train the Empirical Explainer

The training of the Empirical Explainer uses the same script as the training of the downstream model. An annotated config file is provided in

./configs/train/train.empirical-explainer.jsonnet

The model weights can be found in |-models/ after training.

🌄 Empirically explain the downstream model

The efficient explanations are generated with the same script as the expensive explanations. An annotated config file is provided in

./configs/imdb/explain/explain.empirical.jsonnet

Explanations are written to json lines again, which can be found in |-explanations after the job is done.

🎓 Evaluate

Evaluation statistics are logged, which requires you to pipe the logger output into a log, e.g. using

2>&1 | tee -a <base-dir>/logs/my-log.log

An example config file is provided in

./configs/imdb/evaluate/evaluate.jsonnet

The MSEs between the specified target explanations and the approximative explanations are logged, as is the weighted F1 score, which is derived from the true labels and predictions that should be contained in the expensive target explanations.
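The two metrics can be illustrated with a short, dependency-free sketch. It is not the repo's evaluation code: the function names are hypothetical, attribution maps are assumed to be flat lists of floats, and how the repo aggregates MSEs across instances is not specified here.

```python
from collections import Counter


def mean_squared_error(target, approximation):
    """MSE between a target attribution map and its approximation (flat lists)."""
    if not target or len(target) != len(approximation):
        raise ValueError("attribution maps must be non-empty and of equal length")
    return sum((t - a) ** 2 for t, a in zip(target, approximation)) / len(target)


def weighted_f1(true_labels, predictions):
    """Per-class F1, averaged with weights proportional to each class's support."""
    support = Counter(true_labels)  # number of true instances per class
    total = len(true_labels)
    score = 0.0
    for label, count in support.items():
        true_pos = sum(
            1 for t, p in zip(true_labels, predictions) if t == label and p == label
        )
        predicted = sum(1 for p in predictions if p == label)
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / count
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * count / total
    return score
```

A lower MSE means the Empirical Explainer's map is closer to the expensive target map, while the weighted F1 score describes the downstream model's predictions rather than the explanations.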

🎨 Visualize

Finally, you can visualize explanations. For this, specify one or many explanations to load. It is assumed that the instances contained in the json lines appear in the same order in the files you specified.
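Since the ordering is assumed rather than enforced, a quick sanity check before visualizing can be useful. The sketch below compares the input_ids field across files; the helper names are hypothetical, and it assumes that the same instance carries identical input_ids in every file.

```python
import json


def load_input_ids(path):
    """Collect the input_ids of every explanation in a json lines file."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line)["input_ids"] for line in handle if line.strip()]


def first_misaligned_index(paths):
    """Return the index of the first instance that differs across files, or None."""
    columns = [load_input_ids(path) for path in paths]
    # Compare the files instance by instance, up to the shortest file.
    for index, row in enumerate(zip(*columns)):
        if any(ids != row[0] for ids in row[1:]):
            return index
    # All shared instances match; a length mismatch still counts as misaligned.
    lengths = {len(column) for column in columns}
    if len(lengths) > 1:
        return min(lengths)
    return None
```

A return value of None indicates that the files line up; any integer points at the first line worth inspecting.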

We again provide an annotated config file

./configs/visualize/visualize.jsonnet

After the job is done, the sub folder |-visualizations/ contains the heatmaps. The heatmaps are written to a file, line by line, where each line is an HTML document that contains the explanations.

DFKI-NLP/emp-exp