Multi-Modality Self-Distillation for Weakly Supervised Temporal Action Localization
Linjiang Huang (CUHK), Liang Wang (CASIA), Hongsheng Li (CUHK)
Paper: TIP
We propose a pseudo-label-based method that takes full advantage of multiple modalities, i.e., RGB and optical flow sequences, to generate high-quality pseudo labels. The experimental results on THUMOS14 are shown below; each column reports mAP (%) at the given threshold.
Method \ mAP(%) | @0.1 | @0.2 | @0.3 | @0.4 | @0.5 | @0.6 | @0.7 | AVG |
---|---|---|---|---|---|---|---|---|
UntrimmedNet | 44.4 | 37.7 | 28.2 | 21.1 | 13.7 | - | - | - |
STPN | 52.0 | 44.7 | 35.5 | 25.8 | 16.9 | 9.9 | 4.3 | 27.0 |
W-TALC | 55.2 | 49.6 | 40.1 | 31.1 | 22.8 | - | 7.6 | - |
AutoLoc | - | - | 35.8 | 29.0 | 21.2 | 13.4 | 5.8 | - |
CleanNet | - | - | 37.0 | 30.9 | 23.9 | 13.9 | 7.1 | - |
MAAN | 59.8 | 50.8 | 41.1 | 30.6 | 20.3 | 12.0 | 6.9 | 31.6 |
CMCS | 57.4 | 50.8 | 41.2 | 32.1 | 23.1 | 15.0 | 7.0 | 32.4 |
BM | 60.4 | 56.0 | 46.6 | 37.5 | 26.8 | 17.6 | 9.0 | 36.3 |
RPN | 62.3 | 57.0 | 48.2 | 37.2 | 27.9 | 16.7 | 8.1 | 36.8 |
DGAM | 60.0 | 54.2 | 46.8 | 38.2 | 28.8 | 19.8 | 11.4 | 37.0 |
TSCN | 63.4 | 57.6 | 47.8 | 37.7 | 28.7 | 19.4 | 10.2 | 37.8 |
EM-MIL | 59.1 | 52.7 | 45.5 | 36.8 | 30.5 | 22.7 | 16.4 | 37.7 |
BaS-Net | 58.2 | 52.3 | 44.6 | 36.0 | 27.0 | 18.6 | 10.4 | 35.3 |
A2CL-PT | 61.2 | 56.1 | 48.1 | 39.0 | 30.1 | 19.2 | 10.6 | 37.8 |
ACM-BANet | 64.6 | 57.7 | 48.9 | 40.9 | 32.3 | 21.9 | 13.5 | 39.9 |
HAM-Net | 65.4 | 59.0 | 50.3 | 41.1 | 31.0 | 20.7 | 11.1 | 39.8 |
ACSNet | - | - | 51.4 | 42.7 | 32.4 | 22.0 | 11.7 | - |
WUM | 67.5 | 61.2 | 52.3 | 43.4 | 33.7 | 22.9 | 12.1 | 41.9 |
AUMN | 66.2 | 61.9 | 54.9 | 44.4 | 33.3 | 20.5 | 9.0 | 41.5 |
CoLA | 66.2 | 59.5 | 51.5 | 41.9 | 32.2 | 22.0 | 13.1 | 40.9 |
ASL | 67.0 | - | 51.8 | - | 31.1 | - | 11.4 | - |
MMSD (Ours) | 69.7 | 64.3 | 54.6 | 45.0 | 36.4 | 23.0 | 12.3 | 43.6 |
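
The AVG column appears to be the plain mean of the seven per-threshold mAP values (@0.1 through @0.7); this reading is consistent with the complete rows of the table, such as STPN and MMSD. The minimal sketch below recomputes it. The function name `average_map` is hypothetical and is not part of this repository.

```python
def average_map(per_threshold_map):
    """Mean of per-threshold mAP values, rounded to one decimal place.

    Returns None when any threshold is missing (shown as '-' in the table),
    mirroring the rows that report no AVG.
    """
    if any(value is None for value in per_threshold_map):
        return None
    return round(sum(per_threshold_map) / len(per_threshold_map), 1)


# Values copied from the table above (mAP % at @0.1 ... @0.7).
mmsd = [69.7, 64.3, 54.6, 45.0, 36.4, 23.0, 12.3]
stpn = [52.0, 44.7, 35.5, 25.8, 16.9, 9.9, 4.3]
asl = [67.0, None, 51.8, None, 31.1, None, 11.4]  # '-' entries become None

if __name__ == "__main__":
    print(average_map(mmsd))  # 43.6, matching the AVG column
    print(average_map(stpn))  # 27.0, matching the AVG column
    print(average_map(asl))   # None, since some thresholds are not reported
```
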
- Python 3.6
- PyTorch 1.2
- Tensorboard Logger
- CUDA 10.0
- Prepare THUMOS'14 dataset.
  - We recommend using features and annotations provided by this repo.
- Place the features and annotations inside a `dataset/Thumos14reduced/` folder.
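
Before training, it can help to confirm that the folder layout described above is in place. The sketch below is only a sanity check under the assumption that the dataset root is the `dataset` folder and that the features and annotations sit in its `Thumos14reduced` subfolder. The helper `check_dataset_root` is hypothetical and not part of this repository, and it makes no assumption about the file names inside the folder.

```python
from pathlib import Path


def check_dataset_root(dataset_root, subdir="Thumos14reduced"):
    """Return (exists, file_names) for <dataset_root>/<subdir>.

    exists is False and file_names is empty when the folder is missing.
    """
    folder = Path(dataset_root) / subdir
    if not folder.is_dir():
        return False, []
    # List whatever is present; expected file names are not assumed here.
    return True, sorted(entry.name for entry in folder.iterdir())


if __name__ == "__main__":
    exists, names = check_dataset_root("dataset")
    if exists:
        print(f"Found {len(names)} entries in dataset/Thumos14reduced/")
    else:
        print("dataset/Thumos14reduced/ not found; check the dataset path")
```
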
You can easily train the model by running the provided script.
- Refer to `train_options.py`. Modify the argument of `dataset-root` to the path of your `dataset` folder.
- Run the command below.
$ python train_main.py --run-type 0 --model-id 1
Models are saved in ./ckpt/dataset_name/model_id/
The trained model can be found here. Please put it into `./ckpt/dataset_name/model_id/`.
- Run the command below.
$ python train_main.py --pretrained --run-type 1 --model-id 1 --load-epoch 240
`load-epoch` refers to the epoch of the best model. The best model does not always occur at epoch 240; refer to the log in the same folder as the saved models to find the best epoch and set `load-epoch` accordingly.
Make sure the `model-id` you set matches the `model-id` used during training.
We referenced the repos below for the code.
If you have any questions or comments, please contact the first author of the paper, Linjiang Huang (ljhuang524@gmail.com).