Spatio-temporal Video Re-localization by Warp LSTM

by Yang Feng, Lin Ma, Wei Liu, and Jiebo Luo

Introduction

We formulate a new task named spatio-temporal video re-localization. Given a query video and a reference video, spatio-temporal video re-localization aims to localize tubelets in the reference video such that the tubelets semantically correspond to the query. For more details, please refer to our paper.
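To make the notion of a tubelet concrete, the sketch below treats a tubelet as a mapping from frame index to a bounding box and computes a spatio-temporal overlap between two tubelets. The `Box` and `Tubelet` types, the `(x1, y1, x2, y2)` box layout, and the `tubelet_iou` function are assumptions introduced for illustration; they are not taken from this repository, and the paper may define overlap differently.

```python
from typing import Dict, Tuple

# Hypothetical types for illustration only (not from the STVR code base).
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Tubelet = Dict[int, Box]                 # frame index -> box on that frame


def box_area(box: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def box_intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes on the same frame."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return box_area((x1, y1, x2, y2))


def tubelet_iou(a: Tubelet, b: Tubelet) -> float:
    """Intersection volume over union volume, summed across frames."""
    # Only frames covered by both tubelets can contribute to the intersection.
    inter = sum(box_intersection(a[t], b[t]) for t in a.keys() & b.keys())
    vol_a = sum(box_area(box) for box in a.values())
    vol_b = sum(box_area(box) for box in b.values())
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0
```

Under this definition, two tubelets with identical boxes that share only one of their two frames overlap by one third, since the union spans three frames' worth of box area.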

Citation

@InProceedings{feng2019spatio,
  author = {Feng, Yang and Ma, Lin and Liu, Wei and Luo, Jiebo},
  title = {Spatio-temporal Video Re-localization by Warp LSTM},
  booktitle = {CVPR},
  year = {2019}
}

Requirements

pip install tensorflow-gpu
sudo apt install python-opencv

In case you are only interested in the proposed Warp LSTM, please find the implementation in the link.

Dataset.

  1. Generate the dataset for STVR.
    python gen_subsets.py
    

Install the TensorFlow Object Detection API.

  1. mkdir ~/workspace
    cd ~/workspace
    git clone https://github.com/tensorflow/models.git
    cd models
    git remote add yang https://github.com/fengyang0317/tf_models.git
    git fetch yang
    git checkout warp_c
    export PYTHONPATH="${HOME}/workspace/models/research/object_detection:${HOME}/workspace/models/research:${HOME}/workspace/models/research/slim"
    

Then follow the instructions in Installation.

Extract video clips.

  1. We cut the videos into one-second clips for loading into TensorFlow.
    cd ~/workspace
    git clone https://github.com/fengyang0317/STVR.git
    cd STVR
    python split_videos.py --data_dir PATH_TO_VIDEOS --subset train
    python split_videos.py --data_dir PATH_TO_VIDEOS --subset val
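As a rough illustration of what cutting a video into one-second clips involves, the sketch below computes the frame range of each clip from a frame count and a frame rate. It is not taken from `split_videos.py`, whose actual behavior may differ; the function name `one_second_clips` and the rounding rule are assumptions.

```python
from typing import List, Tuple


def one_second_clips(num_frames: int, fps: float) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges, one per second of video.

    Hypothetical helper for illustration; the last clip may be shorter
    than one second when the video length is not a whole number of seconds.
    """
    if num_frames < 0 or fps <= 0:
        raise ValueError("num_frames must be >= 0 and fps must be > 0")
    clips = []
    second = 0
    while True:
        # Round so that non-integer frame rates (e.g. 29.97) stay aligned.
        start = int(round(second * fps))
        if start >= num_frames:
            break
        end = min(int(round((second + 1) * fps)), num_frames)
        if end > start:
            clips.append((start, end))
        second += 1
    return clips
```

For example, `one_second_clips(60, 25)` yields two full 25-frame clips followed by a 10-frame remainder.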
    

Training.

  1. python main.py --data_dir PATH_TO_VIDEOS --batch_size 8 --i3d_ckpt CKPT_PATH
    

Evaluation.

  1. python eval.py --data_dir PATH_TO_VIDEOS --i3d_ckpt CKPT_PATH
    python compute_ap.py
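Assuming `compute_ap.py` reports average precision, as its name suggests, the following minimal sketch shows one common, non-interpolated way to compute that metric from scored detections. The function `average_precision`, its input format, and the choice of the non-interpolated variant are assumptions for illustration; the repository's script may match detections to ground truth and interpolate the curve differently.

```python
from typing import List, Tuple


def average_precision(detections: List[Tuple[float, bool]], num_gt: int) -> float:
    """Non-interpolated average precision for scored detections.

    detections: (confidence, is_true_positive) pairs, in any order.
    num_gt: number of ground-truth instances.
    Hypothetical helper; not the repository's implementation.
    """
    if num_gt <= 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            # Precision at the rank where recall increases.
            precision_sum += true_positives / rank
    return precision_sum / num_gt
```

Whether a detection counts as a true positive would typically depend on an overlap threshold against the ground-truth tubelet, which is left outside this sketch.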
    

Credits

Parts of the code are from kinetics-i3d, ActivityNet, and the TensorFlow Object Detection API.