Can videollama2 continue finetuning on my own dataset using 32 frames?
zhengrongz opened this issue · 2 comments
zhengrongz commented
Hi! Thanks for your excellent work!
I wonder whether I can use 32 frames per video to finetune the model on my own dataset?
If so, do I just need to change the number of sampled frames in the constants?
Looking forward to your reply!
lixin4ever commented
Yes, I believe it is fine to do so. In our internal evaluations, we found that our video models can generalize well to longer input (i.e., more input frames), and they usually perform better given the longer input.
You can specify the number of frames explicitly as an argument in your own script to support training with more video frames.
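For illustration only, here is a minimal sketch of what passing the frame count explicitly might look like in a custom data-loading script. It is not the VideoLLaMA2 implementation: the function name `sample_frame_indices` and its `num_frames` parameter are hypothetical, and uniform sampling is just one common strategy assumed here.

```python
def sample_frame_indices(total_frames: int, num_frames: int = 32) -> list[int]:
    """Return `num_frames` evenly spaced frame indices for a video.

    Hypothetical helper, not part of the VideoLLaMA2 codebase.
    If the video has fewer frames than requested, indices repeat.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    if num_frames == 1:
        # A single frame: take the middle of the video.
        return [total_frames // 2]
    # Integer arithmetic keeps the first index at 0 and the last
    # at total_frames - 1, with the rest spread evenly in between.
    last = total_frames - 1
    return [(i * last) // (num_frames - 1) for i in range(num_frames)]


if __name__ == "__main__":
    # Pass the frame count explicitly instead of relying on a default constant.
    indices = sample_frame_indices(total_frames=320, num_frames=32)
    print(len(indices), indices[0], indices[-1])
```

Keeping the frame count as an explicit argument means the same script can be rerun with 8, 16, or 32 frames without editing shared defaults. Note that more frames per video increase memory use and sequence length during training.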
zhengrongz commented
@lixin4ever OK thank you!