Inseq is a PyTorch-based, hackable toolkit to democratize access to common post-hoc interpretability analyses of sequence generation models.
Inseq is available on PyPI and can be installed with pip:
# Install latest stable version
pip install inseq
# Alternatively, install latest development version
pip install git+https://github.com/inseq-team/inseq.git
To install the extras for visualization in Jupyter Notebooks and 🤗 datasets attribution, run pip install inseq[notebook,datasets].
Dev Installation
To install the package, clone the repository and run the following commands:
cd inseq
make poetry-download # Download and install the Poetry package manager
make install # Installs the package and all dependencies
If you have a GPU available, use make install-gpu to install the latest torch version with GPU support.
Library developers can use the make install-dev command, or its GPU-friendly counterpart make install-dev-gpu, to install all development dependencies (quality, docs, extras).
After installation, you should be able to run make fast-test and make lint without errors.
FAQ Installation
- Installing the tokenizers package requires a Rust compiler installation. You can install Rust from https://rustup.rs and add $HOME/.cargo/env to your PATH.
- Installing sentencepiece requires various packages; install them with sudo apt-get install cmake build-essential pkg-config or brew install cmake gperftools pkg-config.
This example uses the Integrated Gradients attribution method to attribute the English-French translation of a sentence taken from the WinoMT corpus:
import inseq
model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "integrated_gradients")
out = model.attribute(
    "The developer argued with the designer because her idea cannot be implemented.",
    n_steps=100
)
out.show()
This produces a visualization of the attribution scores for each token in the input sentence (token-level aggregation is handled automatically). Here is what the visualization looks like inside a Jupyter Notebook:
Inseq also supports decoder-only models such as GPT-2, enabling usage of a variety of attribution methods and customizable settings directly from the console:
import inseq
model = inseq.load_model("gpt2", "integrated_gradients")
model.attribute(
    "Hello ladies and",
    generation_args={"max_new_tokens": 9},
    n_steps=500,
    internal_batch_size=50
).show()
- 🚀 Feature attribution of sequence generation for most ForConditionalGeneration (encoder-decoder) and ForCausalLM (decoder-only) models from 🤗 Transformers.
- 🚀 Support for multiple feature attribution methods, sourced in part from Captum.
- 🚀 Post-processing of attribution maps via Aggregator classes.
- 🚀 Attribution visualization in notebooks, browser and command line.
- 🚀 Attribute single examples or entire 🤗 datasets with the Inseq CLI.
- 🚀 Custom attribution of target functions, supporting advanced use cases such as contrastive and uncertainty-weighted feature attributions.
- 🚀 Extraction and visualization of custom step scores (e.g. probability, entropy) alongside attribution maps.
Use the inseq.list_feature_attribution_methods function to list all available method identifiers and inseq.list_step_functions to list all available step functions. The following methods are currently supported:
- saliency: Saliency (Simonyan et al., 2013)
- input_x_gradient: Input x Gradient (Simonyan et al., 2013)
- integrated_gradients: Integrated Gradients (Sundararajan et al., 2017)
- deeplift: DeepLIFT (Shrikumar et al., 2017)
- gradient_shap: Gradient SHAP (Lundberg and Lee, 2017)
- discretized_integrated_gradients: Discretized Integrated Gradients (Sanyal and Ren, 2021)
- attention: Attention Weight Attribution (Bahdanau et al., 2014)
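To give an intuition for what the n_steps parameter controls in the integrated_gradients examples above, here is a minimal pure-Python sketch of the Integrated Gradients approximation on a toy function. This is not how Inseq or Captum implement the method: toy_model, toy_grad and the integrated_gradients helper are hypothetical names introduced for illustration, and the all-zeros baseline is an assumption.

```python
from typing import Callable, List

Vector = List[float]


def integrated_gradients(
    grad_fn: Callable[[Vector], Vector],
    inputs: Vector,
    baseline: Vector,
    n_steps: int = 100,
) -> Vector:
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    deltas = [x - b for x, b in zip(inputs, baseline)]
    avg_grads = [0.0] * len(inputs)
    for step in range(n_steps):
        # Interpolation coefficient along the straight path baseline -> inputs.
        alpha = (step + 0.5) / n_steps
        point = [b + alpha * d for b, d in zip(baseline, deltas)]
        for i, grad in enumerate(grad_fn(point)):
            avg_grads[i] += grad / n_steps
    # Scale the averaged gradients by the input-baseline difference.
    return [d * g for d, g in zip(deltas, avg_grads)]


# Hypothetical toy model (assumption): F(x) = x0^2 + 3 * x1
def toy_model(x: Vector) -> float:
    return x[0] ** 2 + 3.0 * x[1]


# Analytical gradient of the toy model.
def toy_grad(x: Vector) -> Vector:
    return [2.0 * x[0], 3.0]


inputs = [2.0, 1.0]
baseline = [0.0, 0.0]  # assumed all-zeros baseline
attributions = integrated_gradients(toy_grad, inputs, baseline, n_steps=100)
print(attributions)
print(sum(attributions), toy_model(inputs) - toy_model(baseline))
```

The two values on the last line should agree up to numerical error: the attributions sum to the difference between the model output at the input and at the baseline. For less regular functions, a larger n_steps gives a closer approximation at a higher compute cost.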
Step functions are used to extract custom scores from the model at each step of the attribution process with the step_scores argument in model.attribute. They can also be used as targets for attribution methods relying on model outputs (e.g. gradient-based methods) by passing them as the attributed_fn argument. The following step functions are currently supported:
- logits: Logits of the target token.
- probability: Probability of the target token.
- entropy: Entropy of the predictive distribution.
- crossentropy: Cross-entropy loss between target token and predicted distribution.
- perplexity: Perplexity of the target token.
- contrast_prob: Probability of the target token when different contrastive inputs are provided to the model. Equivalent to probability when no contrastive inputs are provided.
- pcxmi: Point-wise Contextual Cross-Mutual Information (P-CXMI) for the target token given original and contrastive contexts (Yin et al. 2021).
- kl_divergence: KL divergence of the predictive distribution given original and contrastive contexts. Can be restricted to most likely target token options using the top_k and top_p parameters.
- contrast_prob_diff: Difference in probability between original and foil target tokens pair, can be used for contrastive evaluation as in Contrastive Attribution (Yin and Neubig, 2022).
- mc_dropout_prob_avg: Average probability of the target token across multiple samples using MC Dropout (Gal and Ghahramani, 2016).
- top_p_size: The number of tokens with cumulative probability greater than top_p in the predictive distribution of the model.
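As a rough illustration of what some of these scores measure, the sketch below computes a few of them from the logits of a single generation step in pure Python. It is not Inseq's implementation: toy_step_scores and the example logits are hypothetical, perplexity is taken here as the exponential of the natural-log cross-entropy (Inseq may use a different log base), and top_p_size is read as the size of the smallest set of most likely tokens whose cumulative probability reaches top_p.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Turn raw logits into a probability distribution (numerically stable)."""
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def toy_step_scores(
    logits: List[float], target_id: int, top_p: float = 0.9
) -> Dict[str, float]:
    """Compute illustrative step scores for one step (hypothetical helper)."""
    probs = softmax(logits)
    target_prob = probs[target_id]
    # Cross-entropy of a single target token is its negative log-probability.
    crossentropy = -math.log(target_prob)
    # Shannon entropy (in nats) of the whole predictive distribution.
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    # Count the most likely tokens needed to reach cumulative mass top_p.
    cumulative, top_p_size = 0.0, 0
    for p in sorted(probs, reverse=True):
        cumulative += p
        top_p_size += 1
        if cumulative >= top_p:
            break
    return {
        "logits": logits[target_id],
        "probability": target_prob,
        "entropy": entropy,
        "crossentropy": crossentropy,
        "perplexity": math.exp(crossentropy),  # assumed natural-log convention
        "top_p_size": top_p_size,
    }


# Made-up logits over a four-token vocabulary; token 0 is the target.
scores = toy_step_scores([2.0, 1.0, 0.0, -1.0], target_id=0)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

In Inseq itself these scores are requested by identifier through the step_scores argument rather than computed by hand.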
The following example computes contrastive attributions using the contrast_prob_diff step function:
import inseq
attribution_model = inseq.load_model("gpt2", "input_x_gradient")
# Perform the contrastive attribution:
# Regular (forced) target -> "The manager went home because he was sick"
# Contrastive target -> "The manager went home because she was sick"
out = attribution_model.attribute(
    "The manager went home because",
    "The manager went home because he was sick",
    attributed_fn="contrast_prob_diff",
    contrast_targets="The manager went home because she was sick",
    # We also visualize the corresponding step score
    step_scores=["contrast_prob_diff"]
)
out.show()
Refer to the documentation for an example including custom function registration.
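For intuition about the contrastive scores used above, the following pure-Python sketch shows one plausible reading of contrast_prob_diff and kl_divergence at a single generation step. It is an illustration under stated assumptions, not Inseq's code: the three-token vocabulary, the logits, both function bodies and the direction of the KL divergence (original relative to contrastive) are all hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn raw logits into a probability distribution (numerically stable)."""
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def contrast_prob_diff(
    original_logits: List[float],
    target_id: int,
    contrast_logits: List[float],
    foil_id: int,
) -> float:
    """Probability of the original target minus probability of the foil."""
    return softmax(original_logits)[target_id] - softmax(contrast_logits)[foil_id]


def kl_divergence(original_logits: List[float], contrast_logits: List[float]) -> float:
    """KL(original || contrastive) between two predictive distributions."""
    p = softmax(original_logits)
    q = softmax(contrast_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical vocabulary and logits after "The manager went home because".
vocab = ["he", "she", "they"]
step_logits = [2.0, 1.0, 0.0]

# At this step both targets share the same prefix, so the same logits are
# used for the original target ("he") and the foil ("she").
diff = contrast_prob_diff(step_logits, vocab.index("he"), step_logits, vocab.index("she"))
print(f"contrast_prob_diff: {diff:.4f}")

# A different (made-up) contrastive context yields a different distribution.
shifted_logits = [1.0, 2.0, 0.0]
print(f"kl_divergence: {kl_divergence(step_logits, shifted_logits):.4f}")
```

A positive difference means the model prefers the original target over the foil at that step. Using such a difference as the attributed function attributes the preference for one token over the other, rather than the probability of the target alone.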
The Inseq library also provides useful client commands to enable repeated attribution of individual examples and even entire 🤗 datasets directly from the console. See the available options by typing inseq -h in the terminal after installing the package.
For now, two commands are supported:
- inseq attribute: Wraps the attribute method shown above, requires explicit inputs to be attributed.
- inseq attribute-dataset: Enables attribution for a full dataset using Hugging Face datasets.load_dataset.
Both commands support the full range of parameters available for attribute, attribution visualization in the console and saving outputs to disk.
Example: The following command can be used to perform attribution (both source and target-side) of Italian translations for a dummy sample of 20 English sentences taken from the FLORES-101 parallel corpus, using a MarianNMT translation model from Hugging Face transformers. We save the visualizations in HTML format in the file attributions.html. See the --help flag for more options.
inseq attribute-dataset \
  --model_name_or_path Helsinki-NLP/opus-mt-en-it \
  --attribution_method saliency \
  --do_prefix_attribution \
  --dataset_name inseq/dummy_enit \
  --input_text_field en \
  --dataset_split "train[:20]" \
  --viz_path attributions.html \
  --batch_size 8 \
  --hide
- ⚙️ Support more attention-based and occlusion-based feature attribution methods (documented in #107 and #108).
- ⚙️ Interoperability with ferret for attribution plausibility and faithfulness evaluation.
- ⚙️ Rich and interactive visualizations in a tabbed interface using Gradio Blocks.
Our vision for Inseq is to create a centralized, comprehensive and robust set of tools to enable fair and reproducible comparisons in the study of sequence generation models. To achieve this goal, contributions from researchers and developers interested in these topics are more than welcome. Please see our contributing guidelines and our code of conduct for more information.
If you use Inseq in your research, we suggest mentioning the specific release (e.g. v0.4.0), and we kindly ask you to cite our reference paper as:
@article{sarti-etal-2023-inseq,
author = {Gabriele Sarti and Nils Feldhus and Ludwig Sickert and Oskar van der Wal and Malvina Nissim and Arianna Bisazza},
title = {Inseq: An Interpretability Toolkit for Sequence Generation Models},
month = feb,
year = 2023,
journal = {ArXiv},
volume = {abs/2302.13942},
url = {https://arxiv.org/abs/2302.13942}
}
Inseq has been used in various research projects. A list of known publications that use Inseq to conduct interpretability analyses of generative models is shown below. If you know more, please let us know or submit a pull request (last updated: May 2023).
2023
- Inseq: An Interpretability Toolkit for Sequence Generation Models (Sarti et al., 2023)
- Are Character-level Translations Worth the Wait? Comparing Character- and Subword-level Models for Machine Translation (Edman et al., 2023)
- Response Generation in Longitudinal Dialogues: Which Knowledge Representation Helps? (Mousavi et al., 2023)