towhee-io/towhee

Can clip4clip currently extract only 12 frames with uniform temporal subsampling?

Terran0629 opened this issue · 4 comments

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

I would like to know whether clip4clip can extract at most 12 frames when embedding a video. Using more than 12 frames raises an error.

import towhee
from towhee import pipe,ops
from time import time

device = 'cpu'
pipes = (
    pipe.input('path')
    .map('path','flames',ops.video_decode.ffmpeg(sample_type='uniform_temporal_subsample', args={'num_samples': 12}))
    .map('flames','vec',ops.video_text_embedding.clip4clip(model_name='clip_vit_b32', modality='video', device=device))
    .output('vec')
)
# video_path = r'E:\video_danganguan\0451-2001-006-1139.mp4'
video_path = r'E:\data\video\cartoon.mp4'
s = time()
a = pipes(video_path)
e = time()
print('time:',e-s)

Why is this needed?

No response

Anything else?

No response

Hi, the code limits the maximum number of frames to 12: https://towhee.io/video-text-embedding/clip4clip/src/branch/main/clip4clip.py

Thanks for your reply! If I change the frame count in the source code, that shouldn't affect its functionality, right?

Changing the frame count does not stop the model from running, but the model was trained with 12 frames, so altering it may affect the quality of the results.
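
To illustrate how a video of any length can be reduced to the 12 frames the model was trained with, here is a minimal sketch of uniform temporal subsampling. The helper name `uniform_frame_indices` is hypothetical and is not part of Towhee; it only assumes that sampling means picking evenly spaced frame indices between the first and last frame.

```python
from typing import List


def uniform_frame_indices(total_frames: int, num_samples: int = 12) -> List[int]:
    """Return num_samples evenly spaced frame indices in [0, total_frames - 1].

    Hypothetical helper for illustration only, not Towhee's implementation.
    """
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")
    if num_samples == 1:
        return [0]
    # Integer arithmetic avoids floating-point rounding at the last index.
    return [i * (total_frames - 1) // (num_samples - 1) for i in range(num_samples)]


if __name__ == "__main__":
    # A 120-frame clip is reduced to 12 indices spanning the whole clip.
    print(uniform_frame_indices(120))
```

If the clip has fewer frames than `num_samples`, this sketch repeats some indices so that the output length stays at 12.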

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Close the stale issues and pull requests after 7 days of inactivity. Reopen the issue with /reopen.