
BertViz

Tool for visualizing attention in BERT, GPT-2, XLNet, and RoBERTa. Extends the Tensor2Tensor visualization tool by Llion Jones and the pytorch-transformers library from HuggingFace.

Attention-head view

The attention-head view visualizes the attention patterns produced by one or more attention heads in a given transformer layer.
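To make the underlying data concrete, here is a minimal sketch (not the BertViz implementation) of what a single head's attention weights look like: a square matrix over the tokens, where each row is a probability distribution showing where that token attends. The tokens and values below are made up for illustration; in practice the weights come from the model.

```python
import numpy as np

# Toy attention weights for one head over 4 tokens (row i = where token i attends).
tokens = ["the", "cat", "sat", "down"]
attn = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.25, 0.50, 0.15, 0.10],
    [0.05, 0.60, 0.25, 0.10],
    [0.05, 0.15, 0.60, 0.20],
])

# Each row is a probability distribution over the sequence.
assert np.allclose(attn.sum(axis=1), 1.0)

# The attention-head view draws these weights as lines between tokens;
# here we just report each token's strongest attention target.
for i, tok in enumerate(tokens):
    j = int(attn[i].argmax())
    print(f"{tok!r} attends most to {tokens[j]!r} ({attn[i, j]:.2f})")
```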

Attention-head view (animated)

BERT: [Notebook] [Colab]
GPT-2: [Notebook] [Colab]
XLNet: [Notebook] [Colab]
RoBERTa: [Notebook] [Colab]

Model view

The model view provides a bird's-eye view of attention across all of the model's layers and heads.
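The data behind this view is the full stack of attention weights, one matrix per head per layer. The sketch below (hypothetical dimensions, random stand-in values rather than real model outputs) shows the shape of that tensor and one simple numeric summary across the layers-by-heads grid that the model view renders visually.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for a small BERT-like model.
num_layers, num_heads, seq_len = 12, 12, 8

# Stand-in for the per-layer attention tensors a model returns
# (one (num_heads, seq_len, seq_len) array per layer); random here.
raw = rng.random((num_layers, num_heads, seq_len, seq_len))
attn = raw / raw.sum(axis=-1, keepdims=True)  # each row sums to 1

# The model view lays this out as a num_layers x num_heads grid of
# thumbnails. A quick numeric summary instead: mean attention each
# layer pays to position 0, averaged over heads and query positions.
to_first = attn[..., 0].mean(axis=(1, 2))
for layer, w in enumerate(to_first):
    print(f"layer {layer:2d}: mean attention to position 0 = {w:.3f}")
```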

Model view

BERT: [Notebook] [Colab]
GPT-2: [Notebook] [Colab]
XLNet: [Notebook] [Colab]
RoBERTa: [Notebook] [Colab]

Neuron view

The neuron view visualizes the individual neurons in the query and key vectors and shows how they are used to compute attention.
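The computation the neuron view exposes can be sketched in a few lines: each dimension ("neuron") of a query vector multiplies the matching dimension of a key vector, the products sum to a scaled dot product, and a softmax over those scores yields the attention weights. The vectors below are random toys; in a real model they come from the learned query/key projections inside an attention head.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(1)
d_k, seq_len = 8, 4  # toy per-head dimension and sequence length

# Toy query/key vectors, one per token.
Q = rng.standard_normal((seq_len, d_k))
K = rng.standard_normal((seq_len, d_k))

q = Q[2]  # focus on a single query token, as the neuron view does

# Elementwise products q * k: each neuron contributes one term
# to the dot product between the query and each key.
elementwise = q * K                              # shape (seq_len, d_k)
scores = elementwise.sum(axis=1) / np.sqrt(d_k)  # scaled dot products
weights = softmax(scores)                        # attention from token 2

print("attention weights:", np.round(weights, 3))
```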

Neuron view

BERT: [Notebook] [Colab]
GPT-2: [Notebook] [Colab]
RoBERTa: [Notebook] [Colab]

Requirements

(See requirements.txt)

Execution

git clone https://github.com/jessevig/bertviz.git
cd bertviz
jupyter notebook

Authors

Citation

When referencing BertViz, please cite the following paper:

@article{vig2019transformervis,
  author    = {Jesse Vig},
  title     = {A Multiscale Visualization of Attention in the Transformer Model},
  journal   = {arXiv preprint arXiv:1906.05714},
  year      = {2019},
  url       = {https://arxiv.org/abs/1906.05714}
}

License

This project is licensed under the Apache 2.0 License; see the LICENSE file for details.

Acknowledgments

This project incorporates code from the following repos: