RuntimeError: shape mismatch: value tensor of shape [4194304] cannot be broadcast to indexing result of shape [0]

Question

Closed this issue 7 days ago · 3 comments

After an update, BLIP-2 raises "RuntimeError: shape mismatch: value tensor of shape [4194304] cannot be broadcast to indexing result of shape [0]".
The error occurs when calling blip2-opt-6.7b-coco while executing the script /01_caption.
The full error output is:

Processing model: /home/JJ_Group/mojr/lavad/caption_model/blip2-opt-6.7b-coco
Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
Loading checkpoint shards: 25%|██▌ | 1/4 [00:30<01:31, 30.51s/it]
Loading checkpoint shards: 50%|█████ | 2/4 [00:58<00:57, 28.81s/it]
Loading checkpoint shards: 75%|███████▌ | 3/4 [01:26<00:28, 28.50s/it]
Loading checkpoint shards: 100%|██████████| 4/4 [01:30<00:00, 18.83s/it]
Loading checkpoint shards: 100%|██████████| 4/4 [01:30<00:00, 22.57s/it]

Processing /home/JJ_Group/mojr/lavad/dataset/frames/test_video1: 0%| | 0/21 [00:00<?, ?batch/s]
Processing /home/JJ_Group/mojr/lavad/dataset/frames/test_video1: 0%| | 0/21 [00:03<?, ?batch/s]
Traceback (most recent call last):
  File "/home/JJ_Group/mojr/.conda/envs/lavad/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/JJ_Group/mojr/.conda/envs/lavad/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/JJ_Group/mojr/lavad/src/models/image_captioner.py", line 131, in <module>
    run(
  File "/home/JJ_Group/mojr/lavad/src/models/image_captioner.py", line 103, in run
    captioner.process_video(video)
  File "/home/JJ_Group/mojr/lavad/src/models/image_captioner.py", line 62, in process_video
    generated_ids = self.model.generate(**batch_inputs)
  File "/home/JJ_Group/mojr/.conda/envs/lavad/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/JJ_Group/mojr/.conda/envs/lavad/lib/python3.10/site-packages/transformers/models/blip_2/modeling_blip_2.py", line 2316, in generate
    inputs_embeds[special_image_mask] = language_model_inputs.flatten()
RuntimeError: shape mismatch: value tensor of shape [4194304] cannot be broadcast to indexing result of shape [0]

It ran normally in August 2024, but the BLIP-2 model seems to have been updated in November 2024, which leads to the above problem. How can I solve it?
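
For readers trying to interpret the two shapes in the message, here is a minimal sketch in plain Python (no torch needed). It rests on an interpretation that is not confirmed in this thread: that special_image_mask marks the positions of image placeholder tokens in input_ids, so "indexing result of shape [0]" would mean the tokenized prompt contains no placeholder tokens at all. The batch size (32), number of query tokens (32), and hidden size (4096) are assumed values, picked because their product matches 4194304. The token id 50265 and the helper masked_element_count are hypothetical.

```python
def masked_element_count(input_ids, image_token_id, hidden_size):
    """Count the embedding elements a placeholder-token mask would select.

    Every placeholder token in input_ids selects one full embedding row,
    i.e. hidden_size elements.
    """
    placeholders = sum(row.count(image_token_id) for row in input_ids)
    return placeholders * hidden_size


# Assumed values for illustration only (not taken from the thread).
BATCH_SIZE = 32
NUM_QUERY_TOKENS = 32
HIDDEN_SIZE = 4096
IMAGE_TOKEN_ID = 50265  # hypothetical placeholder token id

# Size of the flattened value tensor on the right-hand side of the assignment.
value_elements = BATCH_SIZE * NUM_QUERY_TOKENS * HIDDEN_SIZE

# Prompt tokenized WITHOUT placeholder tokens (text tokens only).
ids_without = [[2, 102, 1345, 9] for _ in range(BATCH_SIZE)]

# Prompt tokenized WITH one placeholder per query token, followed by the text.
ids_with = [[IMAGE_TOKEN_ID] * NUM_QUERY_TOKENS + [2, 102, 1345, 9]
            for _ in range(BATCH_SIZE)]

print(value_elements)                                                   # 4194304
print(masked_element_count(ids_without, IMAGE_TOKEN_ID, HIDDEN_SIZE))   # 0
print(masked_element_count(ids_with, IMAGE_TOKEN_ID, HIDDEN_SIZE))      # 4194304
```

Under these assumptions, the assignment only succeeds when both counts agree. A mismatch of 4194304 against 0 would be consistent with model code and processor files coming from different versions.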

Answer 1 · 2024-12-04T15:00:17.000Z

You need to update to the latest version of transformers to resolve the issue. Run the following command to upgrade:

pip install --upgrade git+https://github.com/huggingface/transformers.git

Answer 2 · 2024-12-04T16:57:13.000Z

As suggested by @hedachun321 (thanks!), the problem is probably related to the version of transformers and the cached BLIP-2 model. @Gerry048, what version of transformers are you currently using? You can reinstall the same version of transformers you had in August when you downloaded the BLIP-2 model (maybe the one in the requirements.txt file) or upgrade to the latest version of transformers. If you choose to upgrade, you should also delete the cached BLIP-2 model so that the library can download the new version.
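
The two steps mentioned here (checking the installed version and deleting the cached model) could be sketched roughly as below, using only the standard library. Several things are assumptions: that the model was downloaded into the default Hugging Face hub cache (~/.cache/huggingface/hub, or wherever HF_HOME / HF_HUB_CACHE point), that its Hub id is Salesforce/blip2-opt-6.7b-coco, and that cache folders are named "models--<org>--<name>". The helper names are hypothetical. If the model was instead saved to a local folder such as the caption_model directory in the log, that folder would have to be re-downloaded by hand.

```python
import os
import shutil
from importlib import metadata
from pathlib import Path


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def default_hub_cache():
    """Best guess at the hub cache location, honoring the usual env variables."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "hub"


def cached_model_dir(repo_id, hub_cache):
    """Folder where the hub cache is assumed to store the given model repo."""
    return Path(hub_cache) / ("models--" + repo_id.replace("/", "--"))


def remove_cached_model(repo_id, hub_cache, dry_run=True):
    """Delete the cached copy of a model; return True if a cached copy was found.

    With dry_run=True (the default) nothing is deleted.
    """
    target = cached_model_dir(repo_id, hub_cache)
    if not target.is_dir():
        return False
    if not dry_run:
        shutil.rmtree(target)
    return True


REPO_ID = "Salesforce/blip2-opt-6.7b-coco"  # assumed Hub id of the model

print("transformers version:", installed_version("transformers"))
print("cache folder:", cached_model_dir(REPO_ID, default_hub_cache()))
print("cached copy found:", remove_cached_model(REPO_ID, default_hub_cache()))
```

The last call is a dry run. Passing dry_run=False would actually delete the folder, after which the next from_pretrained call should download the model again.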

Answer 3 · 2024-12-07T07:39:46.000Z

Thank you very much for solving the problem!