timoschick/dino

RuntimeError: The size of tensor a (1024) must match the size of tensor b (1370) at non-singleton dimension 3


dino.py raises an error whenever the total sequence (input plus generated tokens) exceeds GPT-2's maximum allowed length of 1024 tokens.
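For context, GPT-2's position embeddings (and its causal attention mask buffer) only cover 1024 positions, so any longer sequence has no valid mask entries and triggers the shape mismatch below. A quick way to confirm the limit, assuming a standard GPT-2 checkpoint from transformers:

```python
# Sketch: read the sequence-length limit straight from the model config.
# "gpt2-xl" is illustrative; all GPT-2 checkpoints report n_positions == 1024.
from transformers import GPT2Config

config = GPT2Config.from_pretrained("gpt2-xl")
print(config.n_positions)  # 1024 -- positions beyond this have no embedding or mask entry
```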

Full stack trace:

Starting dataset generation with DINO...
Traceback (most recent call last):
File "dino.py", line 162, in
num_entries_per_label=args.num_entries_per_label, batch_size=args.batch_size)
File "dev/dino/modeling.py", line 99, in generate_dataset
generate_with_inputs=generate_with_inputs)
File "dev/dino/modeling.py", line 127, in _generate_dataset_entries
do_sample=True, min_length=self.max_output_length, max_length=self.max_output_length, top_k=self.top_k, top_p=self.top_p
File "dev/dino/modeling.py", line 277, in generate_self_debiasing
output_ids = self._model.generate(**inputs, min_length=min_length, max_length=max_length, **kwargs)
File "dev/dino/venv/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "dev/dino/venv/lib/python3.7/site-packages/transformers/generation_utils.py", line 924, in generate
**model_kwargs,
File "dev/dino/generation.py", line 184, in sample
output_hidden_states=output_hidden_states,
File "dev/dino/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "dev/dino/venv/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 901, in forward
return_dict=return_dict,
File "dev/dino/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "dev/dino/venv/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 746, in forward
output_attentions=output_attentions,
File "dev/dino/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "dev/dino/venv/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 294, in forward
output_attentions=output_attentions,
File "dev/dino/venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "dev/dino/venv/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 239, in forward
attn_outputs = self._attn(query, key, value, attention_mask, head_mask, output_attentions)
File "dev/dino/venv/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 174, in _attn
w = torch.where(mask.bool(), w, self.masked_bias.to(w.dtype))
RuntimeError: The size of tensor a (1024) must match the size of tensor b (1370) at non-singleton dimension 3
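The mismatch arises because GPT-2 registers its causal mask as a fixed buffer of shape (1, 1, 1024, 1024); for a 1370-token sequence the sliced mask is silently capped at 1024 along the last dimension and then fails to broadcast against the 1370-wide attention scores. A toy reproduction with made-up tensor sizes (not DINO's actual code):

```python
# Toy reproduction of the broadcast failure; sizes are made up to match the
# report. GPT-2 stores its causal mask as a fixed (1, 1, 1024, 1024) buffer.
import torch

n_positions, seq_len = 1024, 1370
bias = torch.tril(torch.ones(1, 1, n_positions, n_positions, dtype=torch.uint8))
w = torch.rand(1, 12, seq_len, seq_len)          # attention scores for 1370 tokens

mask = bias[:, :, :seq_len, :seq_len]            # silently capped at 1024
torch.where(mask.bool(), w, torch.tensor(-1e4))  # RuntimeError at dimension 3
```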

After a bit of searching, I've found what seems to be a fairly simple fix: editing line 275 in modeling.py to
max_length = min(1024, max_length + input_length). I've raised a PR with the change; could you review it? Thanks!
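For anyone hitting this before the PR lands, here is a minimal standalone sketch of the same clamping idea; the prompt and output length are made up, and in transformers, generate()'s max_length counts prompt and generated tokens together:

```python
# Standalone sketch of the clamp outside DINO's modeling.py; the prompt and
# max_output_length are illustrative. generate()'s max_length covers the
# prompt plus newly generated tokens, so the total must stay within the
# model's position-embedding limit (config.n_positions == 1024 for GPT-2).
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("A very long prompt ...", return_tensors="pt")
input_length = inputs["input_ids"].shape[1]
max_output_length = 40  # desired number of new tokens

max_length = min(model.config.n_positions, max_output_length + input_length)

output_ids = model.generate(**inputs, do_sample=True, max_length=max_length,
                            pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Note that clamping silently shortens the generated output when the prompt is long; truncating the input instead would be an alternative design choice.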

Hi @Andrewlaw171, thank you for pointing this out! I've reviewed your PR and added one minor suggestion :)