# [ECCV 2022] MorphMLP [[arXiv]](https://arxiv.org/abs/2111.12527)
Our MorphMLP paper was accepted to ECCV 2022!!
We currently release the code and models for:
- Kinetics-400
- Something-Something V1
- Something-Something V2
- ImageNet-1K: For training/testing our models on ImageNet-1K, and for transferring the pretrained weights to video usage, please refer to IMAGE.md (a hedged sketch of the transfer idea follows this list).
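For readers curious about the transfer step, here is a minimal, hypothetical sketch of one common recipe for initializing a video backbone from 2D image weights: copy every tensor whose name and shape already match. The function name and checkpoint handling below are placeholders; IMAGE.md documents the repository's actual procedure.

```python
import torch

def load_image_pretrain(video_model, ckpt_path):
    """Hypothetical sketch: initialize a video model from ImageNet-1K
    weights by copying tensors whose names and shapes match.
    See IMAGE.md for the repository's actual transfer procedure."""
    state = torch.load(ckpt_path, map_location="cpu")
    state = state.get("model", state)  # unwrap if the ckpt nests weights
    video_state = video_model.state_dict()
    matched = {k: v for k, v in state.items()
               if k in video_state and v.shape == video_state[k].shape}
    video_state.update(matched)
    video_model.load_state_dict(video_state)
    print(f"transferred {len(matched)}/{len(video_state)} tensors")
```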
Aug 3rd, 2022 [Initial commits]:
- Pretrained models on Kinetics-400, Something-Something V1
The ImageNet-1K pretrained models, together with the following models and logs, can be downloaded from Google Drive: total_models.
We also release the models on Baidu Cloud: total_models (extraction code: cjwu).
- All the models are pretrained on ImageNet-1K. You can find those pre-trained models in the download links above and put them in the `pretrained` folder.
- #Frame = #input_frame x #crop x #clip
  - #input_frame means how many frames are input to the model per inference
  - #crop means the number of spatial crops (e.g., 3 for left/right/center)
  - #clip means the number of temporal clips (e.g., 4 means repeatedly sampling four clips with different start indices)
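For example, testing at 16x3x1 processes 16 input frames for each of 3 spatial crops and 1 temporal clip, i.e., 48 frames per video. A tiny illustrative snippet (the variable names are ours, not from the repo):

```python
# Worked example of "#Frame = #input_frame x #crop x #clip" for 16x3x1
input_frames, n_crops, n_clips = 16, 3, 1
views = n_crops * n_clips              # 3 views = 3 spatial crops x 1 clip
total_frames = input_frames * views    # 48 frames processed per video
print(views, total_frames)             # -> 3 48
```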
### Kinetics-400

| Model | #Frame | Sampling Stride | FLOPs | Top1 | Model | Log | config |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MorphMLP-S | 16x1x4 | 4 | 268G | 78.7 | | | config |
| MorphMLP-S | 32x1x4 | 4 | 532G | 79.7 | | | config |
| MorphMLP-B | 16x1x4 | 4 | 392G | 79.5 | | | config |
| MorphMLP-B | 32x1x4 | 4 | 788G | 80.8 | | | config |
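For clarity on the Sampling Stride column above: each clip takes #input_frame frames spaced `stride` frames apart, and different clips start at different indices. A hedged illustration (the function below is ours, not the repository's loader):

```python
# Sketch: sample 16 frames with stride 4 for one clip starting at `start`;
# multi-clip testing repeats this with different start indices.
def sample_indices(start, num_frames=16, stride=4):
    return [start + i * stride for i in range(num_frames)]

print(sample_indices(0))  # [0, 4, 8, ..., 60]
```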
### Something-Something V1

| Model | Pretrain | #Frame | FLOPs | Top1 | Model | Log | config |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MorphMLP-S | IN-1K | 16x1x1 | 67G | 50.6 | [soon] | [soon] | config |
| MorphMLP-S | IN-1K | 16x3x1 | 201G | 53.9 | [soon] | [soon] | config |
| MorphMLP-B | IN-1K | 16x3x1 | 294G | 55.1 | | | config |
| MorphMLP-B | IN-1K | 32x3x1 | 591G | 57.4 | | | config |
### Something-Something V2

| Model | Pretrain | #Frame | FLOPs | Top1 | Model | Log | config |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MorphMLP-S | IN-1K | 16x3x1 | 201G | 67.1 | [soon] | [soon] | config |
| MorphMLP-S | IN-1K | 32x3x1 | 405G | 68.3 | [soon] | [soon] | config |
| MorphMLP-B | IN-1K | 16x3x1 | 294G | 67.6 | [soon] | [soon] | config |
| MorphMLP-B | IN-1K | 32x3x1 | 591G | 70.1 | [soon] | [soon] | config |
Please follow the installation instructions in INSTALL.md. You may follow the instructions in DATASET.md to prepare the datasets.
- Download the pretrained models into the `pretrained` folder.
- Simply run the training code as follows:

```shell
python3 tools/run_net.py --cfg configs/K400/K400_MLP_S16x4.yaml DATA.PATH_PREFIX path_to_data OUTPUT_DIR your_save_path
```
[Note]:

- You can change the config files to determine which type of experiment to run.
- For more config details, you can read the comments in `slowfast/config/defaults.py`.
- To avoid running out of memory, you can use `torch.utils.checkpoint` (will be updated soon); a sketch of the idea follows.
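As background, here is a minimal sketch of gradient checkpointing with `torch.utils.checkpoint`, which saves memory by recomputing each block's activations during the backward pass instead of storing them. `blocks` is a hypothetical stage list, not this repository's actual module layout:

```python
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedBackbone(nn.Module):
    """Sketch only: wrap a list of blocks so their activations are
    recomputed in backward, trading extra compute for lower memory."""
    def __init__(self, blocks, use_checkpoint=True):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)   # hypothetical stages
        self.use_checkpoint = use_checkpoint

    def forward(self, x):
        for blk in self.blocks:
            if self.use_checkpoint and self.training:
                x = checkpoint(blk, x)  # recompute blk(x) in backward
            else:
                x = blk(x)
        return x
```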
We provide testing examples as follows:

```shell
# multi-clip testing for Kinetics-400
python3 tools/run_net.py --cfg configs/K400/K400_MLP_S16x4.yaml DATA.PATH_PREFIX path_to_data TRAIN.ENABLE False TEST.NUM_ENSEMBLE_VIEWS 4 TEST.NUM_SPATIAL_CROPS 1 TEST.CHECKPOINT_FILE_PATH your_model_path OUTPUT_DIR your_output_dir

# multi-crop testing for Something-Something V1
python3 tools/run_net.py --cfg configs/SSV1/SSV1_MLP_B32.yaml DATA.PATH_PREFIX your_data_path TEST.NUM_ENSEMBLE_VIEWS 1 TEST.NUM_SPATIAL_CROPS 3 TEST.CHECKPOINT_FILE_PATH your_model_path OUTPUT_DIR your_output_dir
```
Specifically, we need to set the number of crops and clips and the checkpoint path, then run the multi-crop/multi-clip test:

- Set the number of crops and clips:
  - Multi-clip testing for Kinetics: `TEST.NUM_ENSEMBLE_VIEWS 4` and `TEST.NUM_SPATIAL_CROPS 1`
  - Multi-crop testing for Something-Something: `TEST.NUM_ENSEMBLE_VIEWS 1` and `TEST.NUM_SPATIAL_CROPS 3`
- Set the checkpoint path via `TEST.CHECKPOINT_FILE_PATH your_model_path`.
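For intuition, here is a minimal sketch of how multi-view testing typically aggregates predictions in SlowFast-style protocols; `model` and the tensor shapes are placeholders, not this repository's exact pipeline:

```python
import torch

@torch.no_grad()
def multi_view_predict(model, views):
    """Sketch: average softmax scores over all spatial crops x temporal
    clips of one video. `views` is a list of (1, C, T, H, W) tensors,
    e.g. 4 clips x 1 crop for Kinetics, 1 clip x 3 crops for Sth-Sth."""
    model.eval()
    scores = [model(v).softmax(dim=-1) for v in views]
    return torch.stack(scores, dim=0).mean(dim=0)  # (1, num_classes)
```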
If you find this repository useful, please use the following BibTeX entry for citation.
```bibtex
@article{zhang2021morphmlp,
  title={MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video},
  author={Zhang, David Junhao and Li, Kunchang and Chen, Yunpeng and Wang, Yali and Chandra, Shashwat and Qiao, Yu and Liu, Luoqi and Shou, Mike Zheng},
  journal={arXiv preprint arXiv:2111.12527},
  year={2021}
}
```
This repository is built on top of the SlowFast and UniFormer repositories.