jessevig/bertviz

Visualize EncoderDecoderModel with tied encoder and decoder

Bachstelze opened this issue

I want to visualize the tied encoder-decoder from https://gitlab.com/Bachstelze/instructionbert/-/blob/main/instructionBertClass.py with the following code:

from instructionbert import instructionBertClass
from transformers import AutoTokenizer
from bertviz import model_view

model_name = "bert-base-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
plainInstructionBERT = instructionBertClass.instructionBERT(model_name, output_attention=True)
model = plainInstructionBERT.sharedModel

input_sentence = "Answer the following question: - number is 54 - debutteam is pittsburgh steelers - draftpick is 166 - birth date is 24 may 1982 - weight is 243 - nfl is wal475737 - debutyear is 2005 - finalteam is new york sentinels - statlabel is tackles sacks interceptions - heightin is 3 - statvalue is 9 0.0 1 - heightft is 6 - college is temple - birth place is pottstown , pennsylvania - draftyear is 2005 - position is linebacker - draftround is 5 - finalyear is 2009 Given the details above, guess who could this information be about.\nAnswer:"
target_output = "The information provided seems to refer to Rian Wallace, a former NFL player."
# tokenize the inputs
encoder_input_ids = tokenizer(input_sentence, return_tensors="pt", add_special_tokens=True).input_ids
with tokenizer.as_target_tokenizer():
    decoder_input_ids = tokenizer(target_output, return_tensors="pt", add_special_tokens=True).input_ids

# feed the inputs into the model
outputs = model(input_ids=encoder_input_ids, decoder_input_ids=decoder_input_ids)

encoder_text = tokenizer.convert_ids_to_tokens(encoder_input_ids[0])
decoder_text = tokenizer.convert_ids_to_tokens(decoder_input_ids[0])

# visualize the attention weights
model_view(
    encoder_attention=outputs.encoder_attentions,
    decoder_attention=outputs.decoder_attentions,
    cross_attention=outputs.cross_attentions,
    encoder_tokens=encoder_text,
    decoder_tokens=decoder_text
)

and I get this error (the traceback below is from an equivalent head_view call):

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-4-035199f57d6e> in <cell line: 3>()
      1 from bertviz import model_view, head_view
      2 # visualize the attention weights
----> 3 head_view(
      4     encoder_attention=outputs.encoder_attentions,
      5     decoder_attention=outputs.decoder_attentions,

/usr/local/lib/python3.10/dist-packages/bertviz/head_view.py in head_view(attention, tokens, sentence_b_start, prettify_tokens, layer, heads, encoder_attention, decoder_attention, cross_attention, encoder_tokens, decoder_tokens, include_layers, html_action)
    160             )
    161     else:
--> 162         raise ValueError("You must specify at least one attention argument.")
    163 
    164     if layer is not None and layer not in include_layers:

ValueError: You must specify at least one attention argument.

From the bertviz source, this error is raised when every attention argument passed to the view is None. Is this tied encoder-decoder handled like a plain encoder, so that I should pass only a single attention argument to the view?
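
For reference, my current guess is that the model call never returned any attentions, so outputs.encoder_attentions, outputs.decoder_attentions, and outputs.cross_attentions were all None. Below is a minimal sketch of the variant I would expect to work, assuming sharedModel behaves like a standard transformers EncoderDecoderModel whose forward method accepts output_attentions (an assumption on my part; I have not verified this against instructionBertClass):

# Request the attention weights explicitly at call time; without this
# (or output_attentions=True in the model config), the encoder_attentions,
# decoder_attentions, and cross_attentions fields of the output are None,
# which triggers the ValueError above.
outputs = model(
    input_ids=encoder_input_ids,
    decoder_input_ids=decoder_input_ids,
    output_attentions=True,
)

model_view(
    encoder_attention=outputs.encoder_attentions,  # encoder self-attention
    decoder_attention=outputs.decoder_attentions,  # decoder self-attention
    cross_attention=outputs.cross_attentions,      # decoder-over-encoder attention
    encoder_tokens=encoder_text,
    decoder_tokens=decoder_text,
)

If the attention tuples still come back as None with output_attentions=True, then they are presumably being dropped somewhere inside the wrapper class rather than by bertviz.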