This repository implements the boundary head proposed in the paper:
Hanyuan Wang, Majid Mirmehdi, Dima Damen, Toby Perrett, Refining Action Boundaries for One-stage Detection, AVSS, 2022
This repository is based on ActionFormer.
If you use this code, please cite:
@INPROCEEDINGS{HanyuanRefining2022,
author={Wang, Hanyuan and Mirmehdi, Majid and Damen, Dima and Perrett, Toby},
booktitle={The 18th IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS)},
title={Refining Action Boundaries for One-stage Detection},
year={2022}}
- Python 3.5+
- PyTorch 1.11
- CUDA 11.0+
- GCC 4.9+
- TensorBoard
- NumPy 1.11+
- PyYaml
- Pandas
- h5py
- joblib
Compile the NMS code by:
cd ./libs/utils
python setup.py install --user
cd ../..
You can download the annotation repository of EPIC-KITCHENS-100 here. Place it in the folder ./data/epic_kitchens/annotations.
You can download the videos of EPIC-KITCHENS-100 here.
You can download the features for EPIC-KITCHENS-100 here. Place them in the folder ./data/epic_kitchens/features.
If everything goes well, the ./data folder should look like this:
data
└── epic_kitchens
├── features
└── annotations
You can download our pretrained models on EPIC-KITCHENS-100 here.
To train the model run:
python ./train.py ./configs/epic_slowfast.yaml --output reproduce --gau_sigma 5.5 --sigma1 0.5 --sigma2 0.5 --verb_cls_weight 0.5 --noun_cls_weight 0.5
To validate the model run:
python ./eval.py ./configs/epic_slowfast.yaml ./ckpt/epic_slowfast_reproduce/name_of_the_best_model --gau_sigma 5.5
[RESULTS] Action detection results (action task):
|tIoU = 0.10: mAP = 19.19 (%)
|tIoU = 0.20: mAP = 18.61 (%)
|tIoU = 0.30: mAP = 17.47 (%)
|tIoU = 0.40: mAP = 16.30 (%)
|tIoU = 0.50: mAP = 14.33 (%)
Average mAP: 17.18 (%)
[RESULTS] Action detection results (noun task):
|tIoU = 0.10: mAP = 23.58 (%)
|tIoU = 0.20: mAP = 22.40 (%)
|tIoU = 0.30: mAP = 21.03 (%)
|tIoU = 0.40: mAP = 19.27 (%)
|tIoU = 0.50: mAP = 16.39 (%)
Average mAP: 20.53 (%)
[RESULTS] Action detection results (verb task):
|tIoU = 0.10: mAP = 23.75 (%)
|tIoU = 0.20: mAP = 22.68 (%)
|tIoU = 0.30: mAP = 21.22 (%)
|tIoU = 0.40: mAP = 19.19 (%)
|tIoU = 0.50: mAP = 16.73 (%)
Average mAP: 20.71 (%)
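The average mAP reported above is simply the mean of the per-tIoU mAPs. As a quick sanity check, the action-task numbers from the table average out as follows (values copied from the results above):

```python
# Average mAP = mean of the mAP values over the evaluated tIoU thresholds.
# The numbers below are the action-task results listed above.
tious = [0.10, 0.20, 0.30, 0.40, 0.50]
maps = [19.19, 18.61, 17.47, 16.30, 14.33]

avg_map = sum(maps) / len(maps)
print(f"Average mAP: {avg_map:.2f} (%)")  # -> Average mAP: 17.18 (%)
```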
This implementation is based on ActionFormer. Our main contributions are in:
./libs/modeling/meta_archs.py:
* We incorporate boundary confidence estimation into the prediction heads, in both training and inference.
* We merged the verb and noun classification heads so the model can also predict results for the action task.
* We implemented label assignment for boundary confidence.
./libs/modeling/losses.py:
* We added supervision of boundary confidence, including confidence scaling and the loss computation.
./libs/datasets/epic_kitchens.py:
* We load noun and verb data together.
./libs/utils/nms.py:
* We changed NMS to sort by action scores instead of separate verb/noun scores.
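To illustrate the label assignment for boundary confidence, here is a minimal sketch of Gaussian-shaped confidence targets centered on the ground-truth start and end, with the standard deviation playing the role of the --gau_sigma flag used in the training command. The function name and exact formulation are hypothetical; the actual assignment lives in ./libs/modeling/meta_archs.py and may differ in detail.

```python
import numpy as np

def boundary_confidence_targets(num_steps, start, end, gau_sigma=5.5):
    """Sketch: Gaussian boundary-confidence labels along the feature timeline.

    `start`/`end` are ground-truth boundary indices; `gau_sigma` mirrors
    the --gau_sigma training flag. This is an illustrative assumption,
    not the repository's exact implementation.
    """
    t = np.arange(num_steps, dtype=np.float32)
    start_conf = np.exp(-((t - start) ** 2) / (2 * gau_sigma ** 2))
    end_conf = np.exp(-((t - end) ** 2) / (2 * gau_sigma ** 2))
    return start_conf, end_conf

s, e = boundary_confidence_targets(100, start=20, end=60)
# Confidence peaks at 1.0 exactly at each annotated boundary and
# decays smoothly (and symmetrically) away from it.
```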