
Sequence Verification for Procedures in Videos With Smooth DTW Loss

Primary language: Python. License: MIT.

SVIP-Smooth-DTW: Sequence VerIfication for Procedures in Videos with Smooth DTW

This repo is an experimental combination of SVIP (https://github.com/svip-lab/SVIP-Sequence-VerIfication-for-Procedures-in-Videos) and VideoAlignment (https://github.com/hadjisma/VideoAlignment).

The main pipeline uses SVIP, so the setup and scripts are copied from there. The Smooth DTW loss is defined in utils/smoothDTW.py, and the training pipeline is modified to use only the Smooth DTW loss. Some example figures are in figs: dist_matrix_*.png shows the distance matrix from Smooth DTW, dtw_matrix_*.png shows the DTW matrix computed through dynamic programming (DP), and frames_*.png shows the paired input frames with their labels.
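
For readers unfamiliar with the idea, the following is a minimal sketch of a smooth DTW recursion in plain Python. It is not the code in utils/smoothDTW.py. The function names, the squared Euclidean frame distance, and the gamma smoothing parameter are assumptions made for illustration. The sketch replaces the hard minimum in the usual DTW recursion with a soft minimum, which is what makes the alignment cost differentiable.

```python
import math


def softmin(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    m = min(values)
    return m - gamma * math.log(sum(math.exp(-(v - m) / gamma) for v in values))


def pairwise_sq_dist(x, y):
    """Squared Euclidean distance between every frame of x and every frame of y."""
    return [[sum((a - b) ** 2 for a, b in zip(xi, yj)) for yj in y] for xi in x]


def smooth_dtw(dist, gamma=0.1):
    """Fill the accumulated-cost matrix by DP, using softmin instead of min.

    dist is an n x m matrix of frame-to-frame distances (illustrative input).
    Returns the final alignment cost and the full (n+1) x (m+1) matrix.
    """
    n, m = len(dist), len(dist[0])
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Three allowed predecessors: diagonal match, step in x, step in y.
            prev = [acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1]]
            acc[i][j] = dist[i - 1][j - 1] + softmin(prev, gamma)
    return acc[n][m], acc


# Toy embeddings standing in for per-frame features of two videos.
seq_a = [[0.0], [1.0], [2.0]]
seq_b = [[0.0], [1.0], [2.0]]
seq_c = [[2.0], [1.0], [0.0]]

cost_same, _ = smooth_dtw(pairwise_sq_dist(seq_a, seq_b), gamma=0.01)
cost_diff, _ = smooth_dtw(pairwise_sq_dist(seq_a, seq_c), gamma=0.01)
```

In this toy setup the identical pair aligns at a much lower cost than the reversed pair. As gamma shrinks, the soft minimum approaches the hard minimum and the result approaches the classic DTW cost.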


Getting Started

Prerequisites

  • python 3.6
  • pytorch 1.7.1
  • cuda 10.2

Installation

  1. Clone the repo and install dependencies.

    git clone https://github.com/svip-lab/SVIP-Sequence-VerIfication-for-Procedures-in-Videos.git
    cd SVIP-Sequence-VerIfication-for-Procedures-in-Videos
    pip install -r requirements.txt
  2. Download the pretrained model.

    Link: here

    Extraction code: 2555


Datasets

Please refer to here for detailed instructions.


Training and Evaluation

We have provided the default configuration files for reproducing our results. Try these commands to play with this project.

  • For training:
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --config configs/train_resnet_config.yml
  • For evaluation:
    CUDA_VISIBLE_DEVICES=0 python eval.py --config configs/eval_resnet_config.yml --root_path [model&log folder] --dist [L2/NormL2] --log_name [xxx]
    Note that we use L2 distance when evaluating on COIN-SV and NormL2 otherwise.
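
To illustrate what the two --dist choices might mean, here is a small sketch in plain Python. It assumes that L2 is the squared Euclidean distance between two video embeddings and that NormL2 is the same distance taken after scaling each embedding to unit length. This README does not confirm either definition, and the helper names below are hypothetical, so check eval.py for the actual behavior.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length (eps guards against division by zero)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / max(norm, eps) for a in v]


def sq_l2(u, v):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def pair_distance(u, v, dist="NormL2"):
    """Distance between two embeddings under the assumed L2 / NormL2 definitions."""
    if dist == "L2":
        return sq_l2(u, v)
    if dist == "NormL2":
        return sq_l2(l2_normalize(u), l2_normalize(v))
    raise ValueError("dist must be 'L2' or 'NormL2'")


# Two embeddings pointing the same way but with different magnitudes.
emb1 = [1.0, 0.0]
emb2 = [2.0, 0.0]

d_l2 = pair_distance(emb1, emb2, dist="L2")
d_norm = pair_distance(emb1, emb2, dist="NormL2")
```

Under these assumptions, L2 is sensitive to embedding magnitude, while NormL2 depends only on direction, so the two embeddings above differ under L2 but coincide under NormL2.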

Trained Models

We provide checkpoints for each dataset trained with this re-organized codebase.

Notice: The reproduced performance is occasionally higher or lower (within a reasonable range) than the results reported in the paper.

Dataset      Split  Paper           Reproduce       ckpt
COIN-SV      val    56.81, 0.4005   58.27, 0.4667   here
             test   51.13, 0.4098   51.55, 0.4658
Diving48-SV  val    91.91, 1.0642   91.69, 1.0928   here
             test   83.11, 0.6009   84.28, 0.6193
CSV          test   83.02, 0.4193   82.88, 0.4474   here

Citation

If you find this repo helpful, please cite our paper:

@inproceedings{qian2022svip,
  title={SVIP: Sequence VerIfication for Procedures in Videos},
  author={Qian, Yicheng and Luo, Weixin and Lian, Dongze and Tang, Xu and Zhao, Peilin and Gao, Shenghua},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={19890--19902},
  year={2022}
}