NielsRogge/Transformers-Tutorials

UDOP Demo notebooks

AurelienVaudois opened this issue · 2 comments

There seems to be a problem with the UDOP demo notebook: I can't start training the model. I get the following error message:

IndexError Traceback (most recent call last)

in <cell line: 20>()
33
34 # forward pass
---> 35 outputs = model(
36 input_ids=input_ids,
37 attention_mask=attention_mask,

6 frames

/usr/local/lib/python3.10/dist-packages/transformers/models/udop/modeling_udop.py in combine_image_text_embeddings(image_embeddings, inputs_embeds, bbox, visual_bbox, attention_mask, num_patches, max_len, image_size, patch_size)
318 sequence_length = num_patches
319 ocr_points_x = torch.clip(
--> 320 torch.floor((bbox[:, :, 0] + bbox[:, :, 2]) / 2.0 * sequence_length).long(), 0, sequence_length - 1
321 )
322 ocr_points_y = (

IndexError: too many indices for tensor of dimension 2

Hi,

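Make sure that the bounding boxes have the appropriate shape: (batch_size, seq_len, 4), since we need 4 coordinates per token. The failing line indexes bbox[:, :, 0], which requires a 3-dimensional tensor, so a 2-dimensional bbox raises this IndexError.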
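
A minimal sketch of that check, in case it helps. The helper name `check_bbox_shape` is hypothetical (it is not part of transformers), and it works on a plain shape tuple so it does not depend on any library:

```python
def check_bbox_shape(shape, batch_size, seq_len):
    """Return True if shape == (batch_size, seq_len, 4), else raise ValueError.

    Hypothetical helper for illustration only; not a transformers API.
    """
    expected = (batch_size, seq_len, 4)
    if tuple(shape) != expected:
        raise ValueError(f"bbox has shape {tuple(shape)}, expected {expected}")
    return True
```

Assuming `input_ids` has shape (batch_size, seq_len), this could be called right before the forward pass as `check_bbox_shape(bbox.shape, *input_ids.shape)`.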

Hi Niels,

Thank you for your answer. I inspected the bboxes from the train_dataloader: most have shape [1, seq_len, 4], but some have shape [1, 4] (see the second screenshot). Is this the cause of the issue?

[Attached: two screenshots of the inspected bbox shapes]
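
To locate the offending examples, something like the following sketch could be used. It assumes each batch is a dict whose "bbox" entry exposes a `.shape` attribute (as PyTorch tensors do); the function name `find_bad_bbox_batches` and the "bbox" key are assumptions for illustration, not taken from the notebook:

```python
def find_bad_bbox_batches(batches, key="bbox"):
    """Return (index, shape) for every batch whose bbox is not (batch, seq_len, 4).

    Hypothetical helper: `batches` is any iterable of dict-like batches,
    and `key` is the assumed name of the bounding-box entry.
    """
    bad = []
    for i, batch in enumerate(batches):
        shape = tuple(batch[key].shape)
        # A valid bbox tensor is 3-D and ends with 4 coordinates per token.
        if len(shape) != 3 or shape[-1] != 4:
            bad.append((i, shape))
    return bad
```

Calling `find_bad_bbox_batches(train_dataloader)` would then list the batch indices with a [1, 4] bbox, which can be traced back to the corresponding dataset examples.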