cvlab-columbia/viper

Bug when using BLIP2 models


Hello!
Thank you for sharing the code of ViperGPT. I noticed that the cropped_image tensor in ImagePatch is divided by 255, but the BLIP2 model expects PIL images or tensors in the original 0–255 pixel range. So when using the BLIP2 model, cropped_image may need to be multiplied back by 255 before being passed to the model. A minimal sketch of the fix is shown below.
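
As a rough sketch of what I mean (assuming cropped_image is a CHW float tensor in [0, 1] after ImagePatch's division by 255; the helper name here is hypothetical, not from the repo):

```python
import torch

def rescale_for_blip2(cropped_image: torch.Tensor) -> torch.Tensor:
    # cropped_image is assumed to be a CHW float tensor in [0, 1]
    # (i.e. already divided by 255 inside ImagePatch). Multiply back
    # and cast to uint8 so BLIP2 sees the original 0-255 pixel range.
    return (cropped_image * 255).clamp(0, 255).to(torch.uint8)
```

Alternatively, torchvision's `to_pil_image` performs this rescaling internally for float tensors, so converting the patch to a PIL image before calling BLIP2 would also work.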

Hi, thanks for catching that! I'll update the code accordingly.