taoyang1122/adapt-image-models

About the performance gap with the released checkpoints

Opened this issue · 6 comments

Thanks for your great work. I have two questions:

  1. Using the same Kinetics-400 validation set as MMAction (19796 videos), the same settings as your configs/recognition/vit/vitclip_base_k400.py (32 x 3 x 1 views during testing), and the checkpoint vit_b_clip_32frame_k400.pth you provided, my evaluation results on the Kinetics-400 validation set are 83.34 (acc@1) and 96.45 (acc@5), which are lower than the results given in README.md, i.e., 84.7 (acc@1) and 96.7 (acc@5). Is there any possible reason for this gap (e.g., do you have a smaller Kinetics-400 validation set due to expired links)?

  2. According to README.md, the checkpoint vit_b_clip_32frame_diving48.pth you provided is tested with 32 x 1 x 1 views, but the views in configs/recognition/vit/vitclip_base_diving48.py are 32 x 1 x 3. My evaluation results are 88.43 (acc@1, 32 x 1 x 3) and 88.32 (acc@1, 32 x 1 x 1), both lower than the result given in README.md, i.e., 88.9 (acc@1, 32 x 1 x 1). Is there any possible reason for this gap?

I am also confused about the following:

  1. According to README.md, the checkpoint vit_b_clip_32frame_k700.pth you provided is tested with 32 x 3 x 3 views, but the views in configs/recognition/vit/vitclip_base_k700.py are 8 x 3 x 3.
  2. Using the same Kinetics-700 validation set as MMAction (34824 videos), the checkpoint vit_b_clip_32frame_k700.pth you provided, and 32 x 3 x 3 testing views, my evaluation result on the Kinetics-700 validation set is 75.78 (acc@1), which is lower than the result given in README.md, i.e., 76.9 (acc@1). Is there any possible reason for this gap? (See the test-pipeline sketch below for how the views notation maps to the config.)
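
For reference, my understanding is that the "frames x clips x crops" views notation maps onto an MMAction2-style test pipeline roughly as follows (a minimal sketch; the frame_interval, crop size, and normalization values are assumptions, not copied from the released configs):

```python
# Minimal sketch of an MMAction2-style test pipeline (values are assumptions,
# not taken from the released configs).
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_bgr=False)

test_pipeline = [
    dict(type='DecordInit'),
    # "32 x 3 x 1" views: clip_len=32 frames, num_clips=3 temporal clips,
    # and one spatial crop (CenterCrop). For "32 x 1 x 3" views, use
    # num_clips=1 together with ThreeCrop instead of CenterCrop.
    dict(type='SampleFrames', clip_len=32, frame_interval=2, num_clips=3,
         test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 224)),
    dict(type='CenterCrop', crop_size=224),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs'])
]
```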

Hi @yangbang18, thanks for your interest in our work.

  1. We have 19404 validation videos. We are using the Kinetics-400 dataset from here.
  2. It may be caused by differences in environment and device.
  3. The config is an example. You can modify the number of frames and the frame_interval for different settings (see the sketch below).
  4. I don't have access to the K700 dataset now. We downloaded the K700 dataset following this.
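
For example, to match the 32 x 3 x 3 views reported in README.md for K700, the SampleFrames entry in the test pipeline of configs/recognition/vit/vitclip_base_k700.py could be changed roughly like this (a sketch; the frame_interval value is an assumption):

```python
# In the test_pipeline of configs/recognition/vit/vitclip_base_k700.py, change
# clip_len from 8 to 32 to obtain 32 x 3 x 3 views; keep num_clips=3 and the
# ThreeCrop step so the temporal/spatial views stay at 3 x 3.
# frame_interval=8 is an assumed value, not taken from the released config.
dict(type='SampleFrames', clip_len=32, frame_interval=8, num_clips=3,
     test_mode=True),
```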

Sorry, I can't access your Kinetics-400 link (even with a VPN).

BTW, I recently made some new findings.

Using the same Kinetics-400 validation set as MMAction (19796 videos), I reproduced the training on 8 V100s with configs/recognition/vit/vitclip_base_k400.py, which gives 83.36 (acc@1) and 96.41 (acc@5) under 32 x 3 x 1 views. These results are similar to those of the checkpoint vit_b_clip_32frame_k400.pth you provided.

Given your acc@1 of 84.9% (as reported in the paper) on 19404 videos, the possible acc@1 range of that model on my validation set (19796 videos) would be [(19404 * 84.9% + 392 * 0%) / 19796, (19404 * 84.9% + 392 * 100%) / 19796] = [83.2%, 85.2%].
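
These bounds come from assuming the 392 extra videos are either all misclassified (lower bound) or all correctly classified (upper bound); a quick script to reproduce them:

```python
# Bounds on acc@1 over 19796 videos, given 84.9% acc@1 on the 19404-video subset,
# assuming the extra 392 videos are either all wrong (lower) or all right (upper).
reported_acc = 0.849
common, extra = 19404, 19796 - 19404  # extra = 392

lower = (common * reported_acc + extra * 0.0) / (common + extra)
upper = (common * reported_acc + extra * 1.0) / (common + extra)
print(f'lower={lower:.1%}, upper={upper:.1%}')  # lower=83.2%, upper=85.2%
```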

Given that my reproduced 83.36 is close to the lower bound (83.2), I suspect the 392 (19796 - 19404) videos missing from your validation set are ones the model finds hard to classify.

Regarding the claim that the gap may be caused by differences in environment and device, I also checked this: I evaluated the released vit_b_clip_32frame_k400.pth checkpoint on a V100 and on a 4090, and both devices gave the same results.

Hi, the link is from Academic Torrents and is provided in MMAction2. You may try another VPN. I will check the results on Diving48.

I downloaded Kinetics-400 from https://opendatalab.com/OpenMMLab/Kinetics-400, which is the same data as MMAction2 (i.e., the same number of training/validation videos). The same goes for Kinetics-700.

I can reproduce the Diving48 results by training, so you can disregard this part.


Hello, @yangbang18
I've been trying to reproduce the Diving48 results by training recently, but I cannot obtain the reported results.
Could you kindly provide your settings, configuration, or log?
Thank you.