This repo contains the reference source code for the paper:
https://arxiv.org/abs/2110.12358
Few-shot versions of Kinetics and Something-Something V2 datasets can be downloaded from here. We used the split from CMN for Kinetics and the split from OTAM for SSv2. If you already have the full versions of Kinetics and SSv2, you can also use ./tools/select_kinetics100.py
to select the few-shot verison datasets. Generate the annotation using ./tools/write_kinetics100.py
For classifier-based methods, we use the standard ResNet50 backbone and training strategies for video classification. Please refer to https://github.com/liu-zhy/temporal-adaptive-module for the feature extractor training. Note that we apply dropout for Baseline Plus and set 'consensus_type=avg' for both classifier-based methods.
For meta-learning methods, modify the corresponding code to have the correct path and filename for the dataset. To train the Meta-Baseline for example (see paper for other hyperparams), run:
CUDA_VISIBLE_DEVICES='0' python proto.py --work_dir [WORK_DIR] --dataset somethingotam
For classifier-based methods, modify ./config/test_baseline.yaml
and run:
CUDA_VISIBLE_DEVICES='0' python baseline_evaluate.py
For meta-learning methods, run:
CUDA_VISIBLE_DEVICES='0' python proto.py --test_model True --checkpoint [CHECKPOINT] --dataset somethingotam
Please refer to utils.py for additional options.
We have modified and integrated the following code into this project:
https://github.com/wyharveychen/CloserLookFewShot
https://github.com/liu-zhy/temporal-adaptive-module
https://github.com/wangzehui20/OTAM-Video-via-Temporal-Alignment