microsoft/VideoX

About Frame Sampling

yusufani opened this issue · 3 comments

Hi,

Thank you for the X-CLIP project.
While trying out the code on Hugging Face, I noticed that 8 frames are sampled sequentially. Do these frames have to be consecutive, or would it make sense to randomly pick 8 frames within 1 second?

[Image: code snippet]

nbl97 commented

Thanks for your interest. In the original paper, frames are sampled with a sparse strategy, i.e., they are sampled uniformly across the video to capture global information. In your code snippet, you can control the sampling interval with frame_sample_rate. Hope this helps.
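
For readers without access to the screenshot, here is a minimal sketch of what uniform sampling with a fixed interval could look like. It is not the exact code from the snippet above: the function name, the `rng` argument, and the random choice of window start are assumptions made for illustration. Only the idea of a `frame_sample_rate` interval comes from the reply.

```python
import numpy as np


def sample_frame_indices(clip_len, frame_sample_rate, seg_len, rng=None):
    """Pick clip_len frame indices spaced frame_sample_rate frames apart.

    seg_len is the total number of frames in the video. The window start
    is drawn at random (an assumption; a fixed start would work as well).
    """
    if rng is None:
        rng = np.random.default_rng()
    # Number of frames covered from the first to the last sampled frame.
    span = (clip_len - 1) * frame_sample_rate + 1
    if span > seg_len:
        raise ValueError("video too short for this clip_len and frame_sample_rate")
    start = int(rng.integers(0, seg_len - span + 1))
    return start + frame_sample_rate * np.arange(clip_len)


# Example: 8 frames, every 4th frame, from a 300-frame video.
indices = sample_frame_indices(8, 4, 300, rng=np.random.default_rng(0))
```

With `frame_sample_rate=1` this returns 8 consecutive frames, and larger values spread the same 8 frames over a longer part of the video.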

yusufani commented

Thank you for your answer. I tried sampling one frame per second and it works quite well 🥳
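
The thread does not show how the one-frame-per-second sampling was done. One possible sketch, assuming the video's frame rate (`fps`) and total frame count are known, is below. The helper name and its parameters are hypothetical, and the default of 8 frames is taken from the original question.

```python
import numpy as np


def one_frame_per_second(fps, total_frames, clip_len=8):
    """Return clip_len frame indices roughly one second apart."""
    # Round the frame rate so that e.g. 29.97 fps becomes a step of 30 frames.
    step = max(1, round(fps))
    indices = step * np.arange(clip_len)
    if indices[-1] >= total_frames:
        raise ValueError("video is shorter than clip_len seconds")
    return indices


# Example: a 10-second video at 30 fps (300 frames).
indices = one_frame_per_second(fps=30, total_frames=300)
```

This always starts at the first frame. A random or centred start offset would be an equally reasonable choice.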

nbl97 commented

Please feel free to ping me if you have further questions.