This repository is the official implementation of SAC. In this work, we tackle the temporal action localization task from the perspective of modality and precisely assign frame-modality attention. The paper is available from arXiv or IEEE.
To install requirements:
conda env create -n env_name -f environment.yaml
Before running the code, please activate this conda environment.
a. Download pre-extracted features from baiduyun (code:6666)
Please ensure the data structure is as below
├── data
│   └── thumos
│       ├── val
│       │   ├── video_validation_0000051_02432.npz
│       │   ├── video_validation_0000051_02560.npz
│       │   └── ...
│       └── test
│           ├── video_test_0000004_00000.npz
│           ├── video_test_0000004_00256.npz
│           └── ...
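Before training, it can help to confirm that the features landed in the expected folders. The snippet below is a minimal sketch, not part of this repository: `summarize_features` is a hypothetical helper, and the root path `data/thumos` is assumed to be relative to the directory you run it from. It relies only on the fact that an `.npz` file is a zip archive of `.npy` arrays, so it lists array names without assuming what they are called.

```python
import zipfile
from pathlib import Path


def summarize_features(root="data/thumos", splits=("val", "test")):
    """Count .npz files per split and list the array names in the first file.

    Hypothetical helper for illustration; the root path is an assumption.
    """
    summary = {}
    for split in splits:
        files = sorted((Path(root) / split).glob("*.npz"))
        names = []
        if files:
            # An .npz file is a zip archive whose members are <array_name>.npy.
            with zipfile.ZipFile(files[0]) as archive:
                names = [Path(member).stem for member in archive.namelist()]
        summary[split] = {"count": len(files), "arrays": names}
    return summary


if __name__ == "__main__":
    for split, info in summarize_features().items():
        print(f"{split}: {info['count']} files, arrays in first file: {info['arrays']}")
```

A count of 0 for either split usually means the features were extracted to a different folder than the one shown above. To read the actual arrays, `numpy.load` on one of the files returns them by the names printed here.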
b. Config
Adjust the configuration in the following file as needed.
./experiments/thumos/network.yaml
c. Train
cd tools
bash run.sh
a. You can download pre-trained models from baiduyun (code:6666), and put the weight file in the folder checkpoint.
- Performance
tIoU | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | Average |
---|---|---|---|---|---|---|---|---|---|---|
mAP | 75.54 | 73.65 | 69.09 | 61.06 | 51.44 | 37.10 | 22.75 | 8.63 | 1.43 | 44.52 |
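
As a quick sanity check on the table, the "Average" column can be reproduced from the per-threshold values. This is a small sketch, assuming the column headers 0.1 to 0.9 are temporal IoU thresholds and that the average is the plain mean over all nine; the variable names are illustrative and do not come from `eval.py`.

```python
# Per-threshold mAP (%) copied from the table above, keyed by threshold.
map_by_threshold = {
    0.1: 75.54, 0.2: 73.65, 0.3: 69.09,
    0.4: 61.06, 0.5: 51.44, 0.6: 37.10,
    0.7: 22.75, 0.8: 8.63, 0.9: 1.43,
}

# Plain arithmetic mean over the nine thresholds.
average_map = sum(map_by_threshold.values()) / len(map_by_threshold)
print(f"Average mAP: {average_map:.2f}")  # prints "Average mAP: 44.52"
```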
b. Test
cd tools
python eval.py
- BackTAL: Background-Click Supervision for Temporal Action Localization.
- A2Net: Revisiting Anchor Mechanisms for Temporal Action Localization.
For any discussions, please contact nwpuyangle@gmail.com.