Summarize video
Closed issue · 1 comment
linchen111 commented
If my multi-image input is a sequence of consecutive images, such as screenshots from a video, what should my prompt be when I use Mistral-7B-LoRA-Multi-VisionCLIPPool-LLAVA?
sshh12 commented
This is the format I used:
<image><image><image> What is happening in these frames?
I'm not sure how well it works, though, given that my training data was mainly compare/contrast rather than video understanding.
It's only trained on up to 6 images, though it may work for more.
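For a video, one way to apply that format is to pick a handful of evenly spaced frames and repeat the `<image>` token once per frame. The sketch below is plain Python and only builds the prompt string and the frame indices. The helper names `sample_frame_indices` and `build_frame_prompt` are hypothetical (not part of the library), and the default cap of 6 simply mirrors the training limit mentioned above. Extracting the frames and passing them to the model is left out.

```python
def sample_frame_indices(total_frames: int, max_frames: int = 6) -> list[int]:
    """Pick up to max_frames evenly spaced frame indices from a clip.

    Hypothetical helper: the cap of 6 is an assumption based on the
    training limit mentioned in this thread.
    """
    if max_frames < 1:
        raise ValueError("max_frames must be at least 1")
    if total_frames <= 0:
        return []
    if total_frames <= max_frames:
        # Short clip: use every frame.
        return list(range(total_frames))
    if max_frames == 1:
        return [0]
    # Spread the picks across the clip, including the first and last frame.
    step = (total_frames - 1) / (max_frames - 1)
    return [round(i * step) for i in range(max_frames)]


def build_frame_prompt(
    num_frames: int,
    question: str = "What is happening in these frames?",
) -> str:
    """Build a prompt with one <image> token per frame, then the question."""
    return "<image>" * num_frames + " " + question


if __name__ == "__main__":
    indices = sample_frame_indices(total_frames=11)
    print(indices)
    print(build_frame_prompt(len(indices)))
```

The number of `<image>` tokens should match the number of images actually supplied, so the prompt is built from the length of the sampled index list rather than a fixed count.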