MorphMLP


[ECCV 2022] MorphMLP [arXiv]

Our MorphMLP paper was accepted to ECCV 2022!!

We currently release the code and models for:

  • Kinetics-400
  • Something-Something V1
  • Something-Something V2
  • ImageNet-1K: For training/testing our models on ImageNet-1K, and for transferring the pretrained weights to the video models, please refer to IMAGE.md (a rough sketch of the weight transfer is shown after this list).
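
As a rough illustration of the weight transfer (the authoritative recipe is in IMAGE.md), a 2D ImageNet-1K checkpoint can be copied into the video model wherever parameter names and shapes match; load_image_pretrained, video_model, and the checkpoint path below are placeholders, not names from this repository:

# Hedged sketch: initialize a video model from an ImageNet-1K checkpoint by
# copying tensors whose names and shapes match; mismatched (e.g. temporal-only)
# parameters keep their random initialization. Names/paths are placeholders;
# see IMAGE.md for the actual procedure.
import torch

def load_image_pretrained(video_model, ckpt_path="pretrained/imagenet_1k.pth"):
    image_state = torch.load(ckpt_path, map_location="cpu")
    video_state = video_model.state_dict()
    matched = {k: v for k, v in image_state.items()
               if k in video_state and v.shape == video_state[k].shape}
    video_state.update(matched)
    video_model.load_state_dict(video_state)
    print(f"Transferred {len(matched)}/{len(video_state)} tensors from {ckpt_path}")
    return video_model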

Update

Aug 3rd, 2022

[Initial commits]:

  1. Pretrained models on Kinetics-400 and Something-Something V1

Model Zoo

The ImageNet-1K pretrained models, together with the models and logs listed below, can be downloaded from Google Drive: total_models.

We also release the models on Baidu Cloud: total_models (extraction code: cjwu).

Note

  • All models are pretrained on ImageNet-1K. You can find the pretrained weights in the links above; put them in the pretrained folder.
  • #Frame = #input_frame x #crop x #clip (see the worked example below)
  • #input_frame is the number of frames fed into the model per inference
  • #crop is the number of spatial crops (e.g., 3 for left/center/right)
  • #clip is the number of temporal clips (e.g., 4 means repeatedly sampling four clips with different start indices)
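
For example, #Frame = 16x1x4 in the Kinetics-400 table below means 16 frames are fed to the model per inference, and the final prediction is averaged over 1 spatial crop x 4 temporal clips, i.e. 16 x 1 x 4 = 64 frames per video at test time.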

Kinetics-400

| Model | #Frame | Sampling Stride | FLOPs | Top-1 (%) | Checkpoint | Log | Config |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| MorphMLP-S | 16x1x4 | 4 | 268G | 78.7 | google | google | config |
| MorphMLP-S | 32x1x4 | 4 | 532G | 79.7 | google | google | config |
| MorphMLP-B | 16x1x4 | 4 | 392G | 79.5 | google | google | config |
| MorphMLP-B | 32x1x4 | 4 | 788G | 80.8 | google | google | config |

Something-Something V1

| Model | Pretrain | #Frame | FLOPs | Top-1 (%) | Checkpoint | Log | Config |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| MorphMLP-S | IN-1K | 16x1x1 | 67G | 50.6 | [soon] | [soon] | config |
| MorphMLP-S | IN-1K | 16x3x1 | 201G | 53.9 | [soon] | [soon] | config |
| MorphMLP-B | IN-1K | 16x3x1 | 294G | 55.1 | google | google | config |
| MorphMLP-B | IN-1K | 32x3x1 | 591G | 57.4 | google | google | config |

Something-Something V2

| Model | Pretrain | #Frame | FLOPs | Top-1 (%) | Checkpoint | Log | Config |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| MorphMLP-S | IN-1K | 16x3x1 | 201G | 67.1 | [soon] | [soon] | config |
| MorphMLP-S | IN-1K | 32x3x1 | 405G | 68.3 | [soon] | [soon] | config |
| MorphMLP-B | IN-1K | 16x3x1 | 294G | 67.6 | [soon] | [soon] | config |
| MorphMLP-B | IN-1K | 32x3x1 | 591G | 70.1 | [soon] | [soon] | config |

Usage

Installation

Please follow the installation instructions in INSTALL.md. You may follow the instructions in DATASET.md to prepare the datasets.

Training

  1. Download the pretrained models into the pretrained folder.

  2. Simply run the training code as follows:

python3 tools/run_net.py --cfg configs/K400/K400_MLP_S16x4.yaml DATA.PATH_PREFIX path_to_data OUTPUT_DIR your_save_path

[Note]:

  • You can change the config files to determine which type of experiment to run.

  • For more config details, you can read the comments in slowfast/config/defaults.py.

  • To avoid running out of memory, you can use torch.utils.checkpoint (will be updated soon); a minimal sketch is shown below:
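
The following is only a sketch of gradient checkpointing with torch.utils.checkpoint, not this repository's actual implementation; the class and attribute names (BackboneWithCheckpointing, self.blocks) are illustrative placeholders:

# Hedged sketch: wrap a stack of backbone blocks so their activations are
# recomputed during the backward pass instead of stored, trading compute for
# memory. `blocks` is any nn.ModuleList of stages; it is a placeholder, not
# the actual MorphMLP code.
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class BackboneWithCheckpointing(nn.Module):
    def __init__(self, blocks, use_checkpoint=True):
        super().__init__()
        self.blocks = blocks
        self.use_checkpoint = use_checkpoint

    def forward(self, x):
        for blk in self.blocks:
            if self.use_checkpoint and self.training:
                # Activations inside `blk` are recomputed on backward.
                x = checkpoint(blk, x)
            else:
                x = blk(x)
        return x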

Testing

We provide testing examples as follows:

Kinetics400

python3 tools/run_net.py --cfg configs/K400/K400_MLP_S16x4.yaml DATA.PATH_PREFIX path_to_data TRAIN.ENABLE False  TEST.NUM_ENSEMBLE_VIEWS 4 TEST.NUM_SPATIAL_CROPS 1 TEST.CHECKPOINT_FILE_PATH your_model_path OUTPUT_DIR your_output_dir

SomethingV1&V2

python3 tools/run_net.py   --cfg configs/SSV1/SSV1_MLP_B32.yaml DATA.PATH_PREFIX your_data_path TEST.NUM_ENSEMBLE_VIEWS 1 TEST.NUM_SPATIAL_CROPS 3 TEST.CHECKPOINT_FILE_PATH your_model_path OUTPUT_DIR your_output_dir

Specifically, set the number of crops and clips as well as your checkpoint path, then run the multi-crop/multi-clip test:

Set the number of crops and clips:

Multi-clip testing for Kinetics

TEST.NUM_ENSEMBLE_VIEWS 4
TEST.NUM_SPATIAL_CROPS 1

Multi-crop testing for Something-Something

TEST.NUM_ENSEMBLE_VIEWS 1
TEST.NUM_SPATIAL_CROPS 3

You can also set the checkpoint path via:

TEST.CHECKPOINT_FILE_PATH your_model_path

Cite MorphMLP

If you find this repository useful, please use the following BibTeX entry for citation.

@article{zhang2021morphmlp,
  title={Morphmlp: A self-attention free, mlp-like backbone for image and video},
  author={Zhang, David Junhao and Li, Kunchang and Chen, Yunpeng and Wang, Yali and Chandra, Shashwat and Qiao, Yu and Liu, Luoqi and Shou, Mike Zheng},
  journal={arXiv preprint arXiv:2111.12527},
  year={2021}
}

Acknowledgement

This repository is built on top of the SlowFast and UniFormer repositories.