A PyTorch implementation of VSumPtrGAN

[WACV'19 (oral)] Attentive and Adversarial Learning for Video Summarization

Project | Paper | YouTube

Overview

VSumPtrGAN is an implementation of
"Attentive and Adversarial Learning for Video Summarization"
Tsu-Jui Fu, Shao-Heng Tai, and Hwann-Tzong Chen
in IEEE Winter Conference on Applications of Computer Vision (WACV) 2019 (oral)

VSumPtrGAN is a GAN-based training framework that combines the merits of unsupervised and supervised video summarization approaches. The generator is an attention-aware Ptr-Net that generates the cutting points of summary fragments. The discriminator is a 3D CNN classifier that judges whether a fragment comes from a ground-truth summary or a generated one. The Ptr-Net generator overcomes the mismatch between training and test sequence lengths in the seq2seq problem, and the discriminator leverages unpaired summaries to achieve better performance.
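To make the pointing idea concrete, here is a minimal sketch of a single pointer-attention step in the style of a generic Ptr-Net: each frame's encoder state is scored against the current decoder state, and the softmax over frame positions says which frame to pick as the next cutting point. This is not the code from the notebooks. The function names (`pointer_step`, `matvec`, `softmax`), the toy weights, and the additive scoring form are assumptions for illustration, and plain Python is used instead of PyTorch only to keep the sketch dependency-free.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pointer_step(encoder_states, decoder_state, w_enc, w_dec, v):
    """One hypothetical pointer step.

    Scores each frame j with v . tanh(w_enc @ e_j + w_dec @ d) and
    returns the index of the most probable frame plus the full
    distribution over frame positions.
    """
    projected_query = matvec(w_dec, decoder_state)
    scores = []
    for state in encoder_states:
        projected_state = matvec(w_enc, state)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_query)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    probs = softmax(scores)
    pointer = max(range(len(probs)), key=probs.__getitem__)
    return pointer, probs

# Toy example: 4 frames with 2-dimensional features and identity weights
# (all values here are made up for illustration).
identity = [[1.0, 0.0], [0.0, 1.0]]
frames = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1], [0.4, 0.3]]
pointer, probs = pointer_step(frames, [0.0, 0.0], identity, identity, [1.0, 1.0])
print(pointer, [round(p, 3) for p in probs])
```

Because the output is a distribution over input positions rather than over a fixed vocabulary, the same step works for videos of any length, which is the property the overview attributes to the Ptr-Net generator.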

Requirements

This code is implemented in Python 3 and PyTorch.
The following libraries are also required:

Usage

  • VisualExtractor
Dataset/model_visual-extractor.ipynb
  • VSumPtrGAN
model_vsum-ptr-gan.ipynb

Resources

Citation

@inproceedings{fu2019vsum-ptr-gan, 
  author = {Tsu-Jui Fu and Shao-Heng Tai and Hwann-Tzong Chen}, 
  title = {Attentive and Adversarial Learning for Video Summarization}, 
  booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)}, 
  year = {2019} 
}

Acknowledgement