jessevig/bertviz

numpy.int64 object is not callable

cgray1117 opened this issue · 5 comments

The error appears in the num_heads function of util.py. The function returns the size of attention[0][0], but it calls size as a function, and on my input size is not callable. Maybe something is wrong on my end, but I've been working on visualizations for two days now and nothing I try makes the code work. Please help.

Hi @cgray1117, sorry for the trouble. Can you verify that the attention parameter has this format?

list of torch.FloatTensor (one for each layer) of shape
(batch_size (must be 1), num_heads, sequence_length, sequence_length)

Otherwise, would you be willing to share your code? Thanks!
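
One way to check the format described above is a small helper like the following. This is only a sketch: `check_attention_format` is a hypothetical name, not part of bertviz, and it duck-types on `.shape` and `.size` instead of importing torch, on the assumption that bertviz calls `size()` as a method on each layer's tensor.

```python
def check_attention_format(attention):
    """Return a list of problems; an empty list means the format looks right.

    Expected: a list/tuple with one tensor per layer, each of shape
    (1, num_heads, sequence_length, sequence_length).
    """
    if not isinstance(attention, (list, tuple)):
        return [f"expected a list or tuple, got {type(attention).__name__}"]

    problems = []
    for i, layer in enumerate(attention):
        raw_shape = getattr(layer, "shape", None)
        if raw_shape is None:
            problems.append(f"layer {i}: object has no .shape")
            continue
        shape = tuple(raw_shape)
        if len(shape) != 4:
            problems.append(f"layer {i}: expected 4 dimensions, got {len(shape)}")
            continue
        batch_size, _num_heads, seq_q, seq_k = shape
        if batch_size != 1:
            problems.append(f"layer {i}: batch_size must be 1, got {batch_size}")
        if seq_q != seq_k:
            problems.append(f"layer {i}: last two dimensions differ ({seq_q} vs {seq_k})")
        # Assumption: torch tensors expose size() as a method; array types
        # that follow NumPy conventions expose size as a plain integer.
        if not callable(getattr(layer, "size", None)):
            problems.append(f"layer {i}: .size is not callable, so this is probably not a torch tensor")
    return problems
```

Running `print(check_attention_format(attention))` just before the `model_view` call should print `[]` when the input matches the expected format.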

from transformers import utils
from bertviz import model_view
from tensorflow.python.ops.numpy_ops import np_config
utils.logging.set_verbosity_error() # Suppress standard warnings

#model_name = "microsoft/xtremedistil-l12-h384-uncased" # Find popular HuggingFace models here: https://huggingface.co/models
input_text = X_train[1]
model = TFRobertaModel.from_pretrained(Ro_MODEL_NAME, output_attentions=True) # Configure model to return attention values
#tokenizer = AutoTokenizer.from_pretrained(model_name)
inputs = Ro_tokenizer.encode(input_text, return_tensors='pt') # Tokenize input text
outputs = model(inputs) # Run model
attention = list(outputs[-1]) # Retrieve attention from model outputs
tokens = Ro_tokenizer.convert_ids_to_tokens(inputs[0]) # Convert input ids to token strings

#model_view(attention, tokens) # Display model view
print(type(attention[0][0]))

Here's the code chunk I'm using. Please let me know if you see where I'm going wrong.

Hmm, it looks good on the surface. What is the output of that last print statement?

<class 'tensorflow.python.framework.ops.EagerTensor'>

Hi @cgray1117, sorry for the delayed response. The attention returned from a TF model is a TF tensor and needs to be converted to a torch tensor. I'll update the code or documentation to make this easier in the future. In the meantime, you can convert the attention object from TF to PT as follows:

import torch
from transformers import utils, TFBertModel, AutoTokenizer
from bertviz import model_view

utils.logging.set_verbosity_error() # Suppress standard warnings

#model_name = "microsoft/xtremedistil-l12-h384-uncased" # Find popular HuggingFace models here: https://huggingface.co/models
input_text = 'The cat sat on the mat.'
model = TFBertModel.from_pretrained('bert-base-uncased', output_attentions=True) # Configure model to return attention values
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
inputs = tokenizer.encode(input_text, return_tensors='tf') # Tokenize input text
outputs = model(inputs) # Run model
attention = outputs[-1] # Retrieve attention from model outputs
tokens = tokenizer.convert_ids_to_tokens(inputs[0]) # Convert input ids to token strings

np_attention = [att.numpy() for att in attention]
pt_attention = [torch.from_numpy(att) for att in np_attention]

model_view(pt_attention, tokens) # Display model view
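
For anyone who lands here from the error message: the sketch below reproduces the same kind of failure without bertviz or TensorFlow. It assumes, as described at the top of this issue, that the library calls `size` as a method (which is how torch tensors work). A NumPy array is used as a stand-in for the non-torch input, since NumPy-style arrays expose `size` as a plain integer attribute. The helper name `call_size_like_torch` is made up for illustration.

```python
import numpy as np

# Stand-in for one layer of attention: (batch_size, num_heads, seq_len, seq_len).
arr = np.zeros((1, 12, 8, 8), dtype=np.float32)

# On a NumPy array, size is an attribute holding the total element count.
element_count = arr.size


def call_size_like_torch(tensor):
    """Mimic code written for torch tensors, where size(dim) is a method."""
    return tensor.size(1)


try:
    call_size_like_torch(arr)
    error_message = None
except TypeError as exc:
    # The integer stored in .size gets called like a function and fails.
    error_message = str(exc)

print(element_count)
print(error_message)
```

Converting each layer with `torch.from_numpy`, as in the snippet above, gives objects whose `size()` is a real method, which is why the conversion resolves the error.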