The cogvlm2-video CLI demo fails with Exception has occurred: RuntimeError: view size is not compatible with input tensor's size and stride
Celine-hxy opened this issue · 3 comments
Celine-hxy commented
System Info
Cambricon, PyTorch 2.1, Python 3.10
Who can help?
No response
Information
- The official example scripts
- My own modified scripts and tasks
Reproduction
- python /home/user/CogVLM2_mlu/video_demo/cli_video_demo.py
- Input this video "https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/baby.mp4"
Exception has occurred: RuntimeError
view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
File "/root/.cache/huggingface/modules/transformers_modules/cogvlm2-video-llama3-chat/visual.py", line 78, in forward
output = self.dense(out.view(B, L, -1))
File "/root/.cache/huggingface/modules/transformers_modules/cogvlm2-video-llama3-chat/visual.py", line 114, in forward
attention_output = self.input_layernorm(self.attention(attention_input))
File "/root/.cache/huggingface/modules/transformers_modules/cogvlm2-video-llama3-chat/visual.py", line 129, in forward
hidden_states = layer_module(hidden_states)
File "/root/.cache/huggingface/modules/transformers_modules/cogvlm2-video-llama3-chat/visual.py", line 165, in forward
x = self.transformer(x)
File "/root/.cache/huggingface/modules/transformers_modules/cogvlm2-video-llama3-chat/modeling_cogvlm.py", line 374, in encode_images
images_features = self.vision(images[0])
File "/root/.cache/huggingface/modules/transformers_modules/cogvlm2-video-llama3-chat/modeling_cogvlm.py", line 402, in forward
images_features = self.encode_images(images)
File "/root/.cache/huggingface/modules/transformers_modules/cogvlm2-video-llama3-chat/modeling_cogvlm.py", line 635, in forward
outputs = self.model(
File "/home/user/CogVLM2_mlu/video_demo/cli_video_demo.py", line 137, in <module>
outputs = model.generate(**inputs, **gen_kwargs)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
Expected behavior
.
zRzRzRzRzRzRzR commented
I tested this and could not reproduce it. Are you sure you are running on an NVIDIA card?
lxslxs1 commented
How did you solve this?
huangshiyu13 commented
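How did you solve this?
The view operator is implemented incorrectly on some cards. When running on such hardware, you need to manually replace view with reshape in the code.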
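For anyone hitting the same message, here is a minimal sketch of the difference between view and reshape on a non-contiguous tensor. It reproduces the same error text on a deliberately transposed tensor with stock PyTorch. Whether this is the exact memory layout produced by the Cambricon backend is an assumption, and the sizes B, L, H, D below are made up for illustration rather than taken from visual.py.

```python
import torch

# Hypothetical sizes (not from the model): batch, sequence length, heads, head dim.
B, L, H, D = 2, 5, 4, 8

# Build a (B, H, L, D) tensor and transpose it to (B, L, H, D).
# transpose() only changes strides, so the result is not contiguous in memory.
out = torch.randn(B, H, L, D).transpose(1, 2)
assert not out.is_contiguous()

# view() cannot merge the last two dimensions without moving data, so it raises.
view_failed = False
try:
    out.view(B, L, -1)
except RuntimeError as err:
    view_failed = True
    print("view failed:", err)

# reshape() returns a view when possible and silently copies otherwise.
merged = out.reshape(B, L, -1)

# An equivalent explicit form: make the tensor contiguous first, then view it.
also_ok = out.contiguous().view(B, L, -1)

assert torch.equal(merged, also_ok)
print(merged.shape)
```

Applied to the traceback above, the workaround would mean changing `out.view(B, L, -1)` on line 78 of visual.py to `out.reshape(B, L, -1)` in the cached copy under transformers_modules. Note that the cached file may be overwritten if the model code is downloaded again.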