csuhan/OneLLM

Images and videos with high resolution

codonna9 opened this issue · 3 comments

Thank you for releasing the model & code. Can the model work with images and videos of high resolution like 720x1280, without having to resize them to 224x224?

csuhan commented

Hi @codonna9, currently the image/video needs to be resized to 224x224. For higher resolution, you can try our SPHINX model, which supports 448x448 inputs.

Thanks a lot for your reply. I tried SPHINX before, but 448 is still a bit small for high-resolution images/videos.

csuhan commented

Yeah. It's a trade-off between resolution and computation.
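
To put rough numbers on that trade-off, here is a minimal sketch. It assumes a ViT-style encoder that splits the input into 14x14 patches and a self-attention cost that grows with the square of the token count. Both are assumptions for illustration and are not confirmed in this thread, and the function names are hypothetical.

```python
def num_patches(height: int, width: int, patch: int = 14) -> int:
    """Number of non-overlapping patch tokens for one frame.

    patch=14 is an assumed patch size, not taken from the thread.
    Pixels that do not fill a whole patch are ignored.
    """
    return (height // patch) * (width // patch)


def relative_attention_cost(height: int, width: int,
                            base: int = 224, patch: int = 14) -> float:
    """Self-attention cost relative to a base x base input.

    Assumes cost scales with the square of the token count.
    """
    tokens = num_patches(height, width, patch)
    base_tokens = num_patches(base, base, patch)
    return (tokens / base_tokens) ** 2


if __name__ == "__main__":
    # Compare the resolutions mentioned in this thread.
    for h, w in [(224, 224), (448, 448), (720, 1280)]:
        print(f"{h}x{w}: {num_patches(h, w)} tokens, "
              f"{relative_attention_cost(h, w):.1f}x attention cost")
```

Under these assumptions, doubling the side length quadruples the token count, and the attention cost rises faster still. For video, this per-frame cost is multiplied by the number of frames.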