/Video-Summarization-with-LSTM

Implementation of our ECCV 2016 Paper (Video Summarization with Long Short-term Memory)

Primary LanguageMATLABOtherNOASSERTION

Video Summarization with LSTM

This repository provides the data and implementation for video summarization with LSTM, i.e. vsLSTM and dppLSTM in our paper:

Video Summarization with Long Short-term Memory
Ke Zhang*, Wei-Lun Chao*, Fei Sha, and Kristen Grauman.
In Proceedings of the European Conference on Computer Vision (ECCV), 2016, Amsterdam, The Netherlands. (*Equal contribution) [pdf] [supp]

If you find the codes or other related resources from this repository useful, please cite the following paper:

@inproceedings{zhang2016video,
  title={Video summarization with long short-term memory},
  author={Zhang, Ke and Chao, Wei-Lun and Sha, Fei and Grauman, Kristen},
  booktitle={ECCV},
  year={2016},
  organization={Springer}
}

Environment

  • MAC OS X or Linux
  • NVIDIA GPU with compute capability 3.5+
  • Python 2.7+
  • Theano 0.7+
  • Matlab

Data

Download the data and unzip to ./data/

Note that we down-sampled the original video by 2fps.

  1. file name: in the format 'Data_$Dataset$_google_p5.h5', e.g. Data_SumMe_google_p5.h5, means the frame level feature of SumMe dataset.
  2. the index of videos are stored as ‘idx’ in the file, in most cases it’s from 1 to n, where n is the number of videos in the dataset (except for Youtube dataset).
  3. feature & ground-truth: the feature is indexed as ‘fea_i’ , the importance is indexed as ‘gt_1_i’ (real number, from the original dataset), and the keyframe we used is indexed as ‘gt_2_i’ (binary value transferred from the original dataset) for the i-th video in the dataset.

Original videos and annotations for each dataset are also available from the the authors' project page

Codes

dppLSTM for video summarization

We have enclosed pre-trained models in the ./model directory download the model and run the following commands:

Download the pre-trained models and unzip it to ./models and run the following commands:

cd ./codes
THEANO_FLAGS=device=gpu0,floatX=float32 python dppLSTM_main.py 

This will automatically run summarization on the video data using pre-trained model, and save the results in ./res_LSTM/ as dppLSTM_$DATASET$_2_inference.h5

If you want to train the model on your own data, just uncomment Line 85 in dppLSTM_main.py

train(model_idx = model_idx, train_set = train_set, val_set = val_set, model_saved = model_file)

Evaluation

For both SumMe and TVSum datasets, you can find the code for evaluation provided by the author:

We also provided the evaluation code with wrappers that help adapt to the datasets above

To run evaluation on the predicted summarization, start the matlab and run the following commands:

cd ./codes
dppLSTM_eval('../data/', '$DATASET$', '/dppLSTM_$DATASET$_2_inference.h5')