Inference on more than 8 frames

Question

Nihel01 opened this issue 10 months ago · 2 comments

Would it be possible to run inference (or even training?) using more than 8 frames from a video?

If it's possible, could you point us to where this is controlled? I have found multiple configs for the number of frames, but I'm not sure whether a single parameter somewhere controls all of them. Thanks.

Answer 1 · 2024-02-26T06:00:25.000Z

The main thing is to change the output of the video encoder. If your video encoder supports more frames, you can feed all of its output to the LLM.
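As a rough illustration of what going past 8 frames involves, here is a minimal sketch of uniform frame sampling and the resulting number of visual tokens passed to the LLM. This is not code from this repository: `sample_frame_indices`, `visual_token_count`, and the 256 tokens-per-frame figure are hypothetical, and it assumes the encoder emits a fixed number of tokens per frame.

```python
def sample_frame_indices(total_frames: int, num_frames: int) -> list[int]:
    """Pick num_frames indices spread evenly across a video (hypothetical helper)."""
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    # Never ask for more frames than the video contains.
    num_frames = min(num_frames, total_frames)
    # Split the video into equal segments and take the midpoint of each.
    step = total_frames / num_frames
    return [int(step * i + step / 2) for i in range(num_frames)]


def visual_token_count(num_frames: int, tokens_per_frame: int) -> int:
    """Visual tokens fed to the LLM, assuming a fixed token count per frame."""
    return num_frames * tokens_per_frame


if __name__ == "__main__":
    indices = sample_frame_indices(total_frames=160, num_frames=16)
    print(indices)
    # 256 tokens per frame is an illustrative value, not a setting from this project.
    print(visual_token_count(len(indices), tokens_per_frame=256))
```

The token count grows linearly with the number of frames, so the total still has to fit within the LLM's context length alongside the text prompt.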

Answer 2 · 2024-03-29T13:33:01.000Z

Check this reply: #123 (comment)