AMD

[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models

[CVPR 2024] Official Implementation of AMD

(Flowchart figure)

The following ranking results are produced by ViT-B:

(Papers with Code leaderboard badges)

Asymmetric Masked Distillation for Pre-Training Small Foundation Models
Zhiyu Zhao, Bingkun Huang, Sen Xing, Gangshan Wu, Yu Qiao, and Limin Wang
Nanjing University, Shanghai AI Lab

News 📰

[2024.3.27] Code and models have been released!
[2024.2.29] Code and models will be released in the following days.
[2024.2.27] AMD is accepted by CVPR 2024! 🎉🎉🎉

Main Results 🚀

✨ Something-Something V2

| Method | Extra Data | Backbone | Resolution | #Frames x Clips x Crops | Top-1 | Top-5 |
| :----: | :--------: | :------: | :--------: | :---------------------: | :---: | :---: |
| AMD    | no         | ViT-S    | 224x224    | 16x2x3                  | 70.2  | 92.5  |
| AMD    | no         | ViT-B    | 224x224    | 16x2x3                  | 73.3  | 94.0  |

✨ Kinetics-400

| Method | Extra Data | Backbone | Resolution | #Frames x Clips x Crops | Top-1 | Top-5 |
| :----: | :--------: | :------: | :--------: | :---------------------: | :---: | :---: |
| AMD    | no         | ViT-S    | 224x224    | 16x5x3                  | 80.1  | 94.5  |
| AMD    | no         | ViT-B    | 224x224    | 16x5x3                  | 82.2  | 95.3  |
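
The "#Frames x Clips x Crops" column above follows the common multi-view testing convention for video models: e.g. 16x5x3 means 16-frame clips, 5 temporal clips per video, and 3 spatial crops per clip, with the softmax scores of all views averaged. A minimal sketch of that averaging, assuming hypothetical `model`, `sample_clip`, and `spatial_crops` helpers rather than this repository's evaluation code:

```python
import torch


@torch.no_grad()
def multi_view_score(model, video, sample_clip, spatial_crops,
                     num_clips=5, num_crops=3):
    """Average softmax scores over temporal clips x spatial crops.

    For the "16x5x3" setting: 16-frame clips, 5 temporal clips per
    video, 3 spatial crops per clip. `model`, `sample_clip` and
    `spatial_crops` are hypothetical placeholders, not this repo's API.
    """
    scores = []
    for clip_idx in range(num_clips):
        clip = sample_clip(video, clip_idx, num_clips)    # (C, T, H, W)
        for crop in spatial_crops(clip, num_crops):       # e.g. 3 spatial views
            logits = model(crop.unsqueeze(0))             # (1, num_classes)
            scores.append(logits.softmax(dim=-1))
    return torch.stack(scores).mean(dim=0)                # (1, num_classes)
```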

✨ AVA 2.2

| Method | Extra Data   | Extra Label | Backbone | #Frame x Sample Rate | mAP  |
| :----: | :----------: | :---------: | :------: | :------------------: | :--: |
| AMD    | Kinetics-400 | ✗           | ViT-B    | 16x4                 | 29.9 |
| AMD    | Kinetics-400 | ✓           | ViT-B    | 16x4                 | 33.5 |

✨ UCF101 & HMDB51

| Method | Extra Data   | Backbone | UCF101 | HMDB51 |
| :----: | :----------: | :------: | :----: | :----: |
| AMD    | Kinetics-400 | ViT-B    | 97.1   | 79.6   |

✨ ImageNet-1K

| Method | Extra Data | Backbone | Resolution | Top-1 |
| :----: | :--------: | :------: | :--------: | :---: |
| AMD    | no         | ViT-S    | 224x224    | 82.1  |
| AMD    | no         | ViT-B    | 224x224    | 84.6  |

Installation 🔨

Please follow the instructions in INSTALL.md.

Data Preparation ➡️

Please follow the instructions in DATASET.md for data preparation.

Pre-training 🔄

The pre-training instruction is in PRETRAIN.md.
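
In the paper, asymmetric masked distillation roughly means that the teacher encodes a lightly masked view of the video tokens while the smaller student encodes a much more heavily masked subset of the same tokens and learns to match the teacher's features. The sketch below only illustrates that idea under assumed names and ratios (`student`, `teacher`, `proj`, `teacher_keep`, `student_keep`, and a plain MSE alignment loss); it is not the actual objective or code of this repository, which is documented in PRETRAIN.md.

```python
import torch
import torch.nn.functional as F


def asymmetric_distillation_step(student, teacher, proj, tokens,
                                 teacher_keep=0.5, student_keep=0.1):
    """One illustrative asymmetric-masking distillation step.

    `tokens` is a (B, N, D) sequence of patch embeddings; `student` and
    `teacher` are hypothetical token encoders (teacher frozen here),
    `proj` maps student features to the teacher's dimension. The teacher
    keeps more visible tokens than the student, and the student's
    visible set is a subset of the teacher's, so matching token
    features can be compared directly.
    """
    B, N, D = tokens.shape
    order = torch.rand(B, N, device=tokens.device).argsort(dim=1)
    teacher_idx = order[:, : int(N * teacher_keep)]        # larger visible set
    student_idx = teacher_idx[:, : int(N * student_keep)]  # subset of it

    def visible(idx):
        return torch.gather(tokens, 1, idx.unsqueeze(-1).expand(-1, -1, D))

    with torch.no_grad():
        t_feat = teacher(visible(teacher_idx))              # (B, Nt, Dt)
    s_feat = proj(student(visible(student_idx)))            # (B, Ns, Dt)

    # Align the student's visible-token features with the teacher features
    # of the same tokens (the first Ns teacher tokens by construction).
    return F.mse_loss(s_feat, t_feat[:, : s_feat.size(1)])
```

The point of the asymmetry is that the teacher, seeing far more tokens, can provide richer feature targets than the student's heavily masked input alone would allow.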

Fine-tuning ⤴️

The fine-tuning instruction is in FINETUNE.md.

Model Zoo 📍

We provide pre-trained and fine-tuned models in MODEL_ZOO.md.

Acknowledgements 👍

This project is built upon VideoMAEv2 and MGMAE. Thanks to the contributors of these great codebases.

Citation ✏️

If you find this repository useful, please use the following BibTeX entry for citation.

@misc{zhao2023amd,
      title={Asymmetric Masked Distillation for Pre-Training Small Foundation Models}, 
      author={Zhiyu Zhao and Bingkun Huang and Sen Xing and Gangshan Wu and Yu Qiao and Limin Wang},
      year={2023},
      eprint={2311.03149},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}