A tool for visualizing attention in BERT and OpenAI GPT-2. It extends the Tensor2Tensor visualization tool and pytorch-pretrained-BERT.
Blog posts:
- Deconstructing BERT: Distilling 6 Patterns from 100 Million Parameters
- Deconstructing BERT Part 2: Visualizing the Inner Workings of Attention
- OpenAI GPT-2: Understanding Language Generation through Visualization
Paper:
- Visualizing Attention in Transformer-Based Language Representation Models
The attention-head view visualizes the attention patterns produced by one or more attention heads in a given transformer layer.
BERT: [Notebook] [Colab]
OpenAI GPT-2: [Notebook] [Colab]
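As a rough illustration of the data behind this view, the sketch below picks one layer and a set of heads from a toy attention array and lists the token-to-token weights that would be drawn as lines. It does not use BertViz itself: the function name `head_view_edges`, the `attention[layer][head][from][to]` nesting, the `threshold` parameter, and the toy weights are all assumptions made for this example.

```python
def head_view_edges(attention, tokens, layer, heads, threshold=0.1):
    """Collect (head, from_token, to_token, weight) for the chosen layer and heads.

    Hypothetical helper: `attention` is assumed to be nested lists indexed as
    attention[layer][head][from_position][to_position].
    """
    edges = []
    for head in heads:
        # Rows are the attending token, columns the attended-to token.
        matrix = attention[layer][head]
        for i, row in enumerate(matrix):
            for j, weight in enumerate(row):
                # Keep only weights at or above the (illustrative) cutoff.
                if weight >= threshold:
                    edges.append((head, tokens[i], tokens[j], weight))
    return edges


tokens = ["the", "cat", "sat"]

# Toy weights: 1 layer, 2 heads, 3 x 3 tokens; each row sums to 1.
attention = [
    [
        [[0.8, 0.1, 0.1], [0.05, 0.9, 0.05], [0.0, 0.5, 0.5]],
        [[0.2, 0.7, 0.1], [0.3, 0.3, 0.4], [0.6, 0.0, 0.4]],
    ]
]

edges = head_view_edges(attention, tokens, layer=0, heads=[0, 1], threshold=0.5)
for head, src, dst, weight in edges:
    print(f"head {head}: {src} -> {dst} ({weight:.2f})")
```

Selecting a single head in `heads` isolates that head's pattern, while passing several heads lists their patterns side by side for the same layer.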
The model view provides a bird's-eye view of attention across all of the model’s layers and heads.
BERT: [Notebook] [Colab]
OpenAI GPT-2: [Notebook] [Colab]
The neuron view visualizes the individual neurons in the query and key vectors and shows how they are used to compute attention.
BERT: [Notebook] [Colab]
OpenAI GPT-2: [Notebook] [Colab]
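The computation that the neuron view traces can be sketched in a few lines of plain Python. This is a minimal example of standard scaled dot-product attention for a single query token, not code from BertViz: the function name `attention_from_query` and the toy vectors are assumptions, and real models use much larger query and key vectors.

```python
import math


def attention_from_query(query, keys):
    """Return (element-wise products, scaled scores, softmax weights) for one query.

    Hypothetical helper written for illustration only.
    """
    d = len(query)
    # Element-wise product q_i * k_i for each key: the per-neuron contributions.
    products = [[q * k for q, k in zip(query, key)] for key in keys]
    # Summing the contributions gives the dot product; scale it by sqrt(d).
    scores = [sum(p) / math.sqrt(d) for p in products]
    # Softmax over the keys, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return products, scores, weights


# Toy 4-dimensional query vector and three key vectors (made-up values).
query = [1.0, 0.0, -1.0, 2.0]
keys = [
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 2.0, 1.0, 0.0],
    [-1.0, 0.0, 1.0, -1.0],
]

products, scores, weights = attention_from_query(query, keys)
print(scores)   # scaled dot products: [1.5, -0.5, -2.0]
print(weights)  # attention weights over the three keys; they sum to 1
```

The `products` rows show which individual neurons push a score up or down: here the first key gets the highest weight mainly because its last component aligns with the large last component of the query.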
When referencing BertViz, please cite this paper.
@article{vig2019transformervis,
author = {Jesse Vig},
title = {Visualizing Attention in Transformer-Based Language Representation Models},
journal = {arXiv preprint arXiv:1904.02679},
year = {2019},
url = {https://arxiv.org/abs/1904.02679}
}
This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
This project incorporates code from the following repos: