About input frames and sampling interval
Thank you for your excellent work! I have a question about clip_len and frame_interval for Kinetics. Appendix A.1 says, "We evaluate the model on 8, 16, 32 frames and the sampling interval is 16, 8, 4, respectively." Does this mean that for kinetics400/700 the data pipeline (train, val, test) should use the same settings? For example, configs/recognition/vit/vit_imagenet_k400.py uses clip_len=8, frame_interval=16 for the train/val/test pipelines, which matches the paper.
adapt-image-models/configs/recognition/vit/vit_imagenet_k400.py
Lines 19 to 21, 32 to 39, and 49 to 56 in 392647e
However, for the CLIP-pretrained models, the configs are confusing.
vitclip_base_k400 uses clip_len=32, frame_interval=16 for the train pipeline, but clip_len=32, frame_interval=8 for the val/test pipelines. If clip_len=32, shouldn't frame_interval be 4?
adapt-image-models/configs/recognition/vit/vitclip_base_k400.py
Lines 19 to 21, 32 to 39, and 49 to 56 in 392647e
vitclip_large_k400 uses clip_len=16, frame_interval=16 for the train/val/test pipelines. If clip_len=16, shouldn't frame_interval be 8?
adapt-image-models/configs/recognition/vit/vitclip_large_k400.py
Lines 19 to 21, 32 to 39, and 49 to 56 in 392647e
Thank you.
Hi @BinhuiXie, thanks for your interest in our work. You can safely follow the settings described in the paper. I will update the code.
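For anyone adjusting the configs locally before they are updated, here is a minimal sketch of the paper's pairing (8, 16, 32 frames with intervals 16, 8, 4) and a check against the values reported in this issue. It is not the repository's code: the helper names `sample_frames_step` and `find_mismatches` are hypothetical, `num_clips=1` is an assumption, and the `REPORTED` table only restates the numbers quoted above rather than reading the actual config files.

```python
# Pairing quoted from Appendix A.1: clip_len -> frame_interval.
# Note that clip_len * frame_interval is 128 in all three cases.
PAPER_FRAME_INTERVAL = {8: 16, 16: 8, 32: 4}


def sample_frames_step(clip_len, num_clips=1):
    """Build a SampleFrames-style pipeline entry for a given clip_len.

    Hypothetical helper: num_clips=1 is an assumption, not taken from the repo.
    """
    return dict(
        type="SampleFrames",
        clip_len=clip_len,
        frame_interval=PAPER_FRAME_INTERVAL[clip_len],
        num_clips=num_clips,
    )


def find_mismatches(configs):
    """Return (config, split, clip_len, frame_interval, expected) for each
    entry whose frame_interval differs from the paper's pairing."""
    mismatches = []
    for name, splits in configs.items():
        for split, (clip_len, frame_interval) in splits.items():
            expected = PAPER_FRAME_INTERVAL.get(clip_len)
            if expected is not None and frame_interval != expected:
                mismatches.append((name, split, clip_len, frame_interval, expected))
    return mismatches


# (clip_len, frame_interval) per split, as reported in this issue.
REPORTED = {
    "vit_imagenet_k400": {"train": (8, 16), "val": (8, 16), "test": (8, 16)},
    "vitclip_base_k400": {"train": (32, 16), "val": (32, 8), "test": (32, 8)},
    "vitclip_large_k400": {"train": (16, 16), "val": (16, 16), "test": (16, 16)},
}

if __name__ == "__main__":
    for name, split, clip_len, interval, expected in find_mismatches(REPORTED):
        print(f"{name} [{split}]: clip_len={clip_len}, "
              f"frame_interval={interval}, paper uses {expected}")
```

With the values reported here, the check flags all three splits of vitclip_base_k400 and vitclip_large_k400 and none of vit_imagenet_k400.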