FeatureFlow

A state-of-the-art video frame interpolation method using deep semantic flow blending.

FeatureFlow: Robust Video Interpolation via Structure-to-texture Generation (IEEE Conference on Computer Vision and Pattern Recognition 2020)

To Do List

Preprint
Training code

Requirements
Demos
Installation
Pre-trained Model
Download Results
Evaluation
Test your video
Training
Citation

Requirements

Ubuntu
PyTorch (>=1.1)
CUDA (>=10.0) & cuDNN (>=7.0)
mmdet 1.0rc (from https://github.com/open-mmlab/mmdetection.git)
visdom (optional)
NVIDIA GPU
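
To check an existing environment against the minimum versions listed above, a small script along the following lines may help. It is a minimal sketch, not part of this repository: `version_tuple` and `meets_minimum` are hypothetical helper names, and the comparison only looks at the leading numeric components of a version string.

```python
import re


def version_tuple(version: str) -> tuple:
    """Turn '1.4.0+cu100' into (1, 4, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """True if `installed` is at least `minimum` (numeric components only)."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.1' and '1.1.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch >= 1.1:", meets_minimum(torch.__version__, "1.1"))
        print("CUDA available:", torch.cuda.is_available())
        # torch.version.cuda is None for CPU-only builds.
        print("CUDA build version:", torch.version.cuda)
```

The helpers use only the standard library, so the script still runs (and reports the missing package) when PyTorch is not installed.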

Video demos

Click a picture to download the corresponding demo, or click Here (Google) or Here (Baidu) (key: oav2) to download the 360p demos.

360p demos (including comparisons):

720p demos:

Installation

Clone this repo.
Clone mmdetection: git clone https://github.com/open-mmlab/mmdetection.git
Install mmdetection by following the guidance in its GitHub repository:

$ cd mmdetection
$ pip install -r requirements/build.txt
$ pip install "git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI"
$ pip install -v -e .  # or "python setup.py develop"
$ pip list | grep mmdet

Download the test set, then unpack and prepare it:

$ unzip vimeo_interp_test.zip
$ cd vimeo_interp_test
$ mkdir sequences
$ cp -r target/* sequences/
$ cp -r input/* sequences/
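
For systems without a Unix shell, the merge of `target` and `input` into `sequences` can be sketched in Python. This is an illustrative equivalent under stated assumptions, not a script shipped with the repository: `build_sequences` is a hypothetical name, and `dirs_exist_ok` requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def build_sequences(test_root: Path) -> Path:
    """Merge test_root/target and test_root/input into test_root/sequences."""
    sequences = test_root / "sequences"
    sequences.mkdir(exist_ok=True)
    for source_name in ("target", "input"):  # same order as the cp commands
        source = test_root / source_name
        # dirs_exist_ok=True merges into the existing tree and overwrites
        # files that share a relative path. Unlike the shell glob `*`,
        # copytree also copies hidden files.
        shutil.copytree(source, sequences, dirs_exist_ok=True)
    return sequences


if __name__ == "__main__":
    root = Path("vimeo_interp_test")
    if root.is_dir():
        print("merged into", build_sequences(root))
    else:
        print("vimeo_interp_test not found; unzip the test set first")
```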

Download BDCN's pre-trained model bdcn_pretrained_on_bsds500.pth to ./model/bdcn/final-model/.

$ pip install scikit-image visdom tqdm prefetch-generator

Pre-trained Model

Google Drive

Baidu Cloud (key: ae4x)

Place FeFlow.ckpt in ./checkpoints/.
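
Before running evaluation or inference, it can be worth confirming that both downloaded model files sit where this README says they should. The following is a minimal sketch: `missing_files` is a hypothetical helper, and the two paths are copied from the instructions in this README, relative to the repository root.

```python
from pathlib import Path

# Paths as given in this README, relative to the repository root.
REQUIRED_FILES = (
    "checkpoints/FeFlow.ckpt",
    "model/bdcn/final-model/bdcn_pretrained_on_bsds500.pth",
)


def missing_files(repo_root: Path, required=REQUIRED_FILES) -> list:
    """Return the required files that do not exist under repo_root."""
    return [name for name in required if not (repo_root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(Path("."))
    for name in missing:
        print("missing:", name)
    if not missing:
        print("all model files found")
```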

Download Results

Google Drive

Baidu Cloud (key: pc0k)

Evaluation

$ CUDA_VISIBLE_DEVICES=0 python eval_Vimeo90K.py --checkpoint ./checkpoints/FeFlow.ckpt --dataset_root ~/datasets/videos/vimeo_interp_test --visdom_env test --vimeo90k --imgpath ./results/

Test your video

$ CUDA_VISIBLE_DEVICES=0 python sequence_run.py --checkpoint checkpoints/FeFlow.ckpt --video_path ./yourvideo.mp4 --t_interp 4 --slow_motion

--t_interp sets the frame multiplier; only powers of 2 (2, 4, 8, ...) are supported. Use the --slow_motion flag to slow down the video while keeping the original fps.

The output video will be saved as output.mp4 in your working directory.
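
To get a feel for how --t_interp and --slow_motion interact, the sketch below estimates the output frame count, frame rate, and duration. It rests on two assumptions that have not been checked against sequence_run.py: that `t_interp - 1` new frames are inserted between each pair of neighbouring input frames, and that without --slow_motion the frame rate is multiplied by `t_interp` so the duration stays roughly the same. `is_power_of_two` and `plan_output` are hypothetical names.

```python
def is_power_of_two(value: int) -> bool:
    """True for 2, 4, 8, ...; 1 is excluded because 2 is the smallest listed value."""
    return value >= 2 and (value & (value - 1)) == 0


def plan_output(num_frames: int, fps: float, t_interp: int, slow_motion: bool):
    """Estimate (frame count, fps, duration in seconds) of the output video."""
    if not is_power_of_two(t_interp):
        raise ValueError("t_interp must be a power of 2 (2, 4, 8, ...)")
    # Assumption: t_interp - 1 frames are inserted into each gap between frames.
    out_frames = (num_frames - 1) * t_interp + 1
    # Assumption: slow motion keeps the fps; otherwise the fps is multiplied.
    out_fps = fps if slow_motion else fps * t_interp
    return out_frames, out_fps, out_frames / out_fps


if __name__ == "__main__":
    # A hypothetical 100-frame clip at 25 fps (4 seconds long).
    for slow in (True, False):
        frames, out_fps, seconds = plan_output(100, 25.0, 4, slow)
        print(f"slow_motion={slow}: {frames} frames at {out_fps} fps, {seconds:.2f} s")
```

Under these assumptions, the 4-second example clip becomes roughly four times longer with --slow_motion and keeps roughly its original length without it.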

Training

The training code train.py is now available. I can't run it to confirm because I've left the lab, but I'm sure it will work with the right argument settings.

$ CUDA_VISIBLE_DEVICES=0,1 python train.py <arguments>

Please read the arguments' help text carefully to fully control the two-step training.
Pay attention to --GEN_DE, the flag that sets the model to Stage-I or Stage-II.
2 GPUs are necessary for training; otherwise the small batch_size will cause the training process to crash.
Deformable CNN is not stable enough, so training may sometimes crash (I didn't fix the random seed). A crash can be detected soon after the run starts by visualizing results with Visdom.
Visdom visualization code [lines 75, 201-216 and 338-353] is included, which is useful for viewing the training process and checking for crashes.
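
Because training needs 2 GPUs, a quick look at CUDA_VISIBLE_DEVICES before launching can catch a too-small configuration early. The sketch below is an illustration only and is not part of train.py: `visible_gpu_ids` and `enough_gpus_for_training` are hypothetical helpers, and the check only counts the ids listed in the environment variable, not the GPUs physically present.

```python
import os


def visible_gpu_ids(env=None):
    """Return the device ids listed in CUDA_VISIBLE_DEVICES, or None if unset."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None  # unset: CUDA exposes every GPU on the machine
    return [item.strip() for item in value.split(",") if item.strip()]


def enough_gpus_for_training(env=None, minimum=2):
    """False only when the variable explicitly lists fewer than `minimum` ids."""
    ids = visible_gpu_ids(env)
    # With the variable unset, this sketch cannot tell how many GPUs exist,
    # so it only rejects configurations that are explicitly too small.
    return ids is None or len(ids) >= minimum


if __name__ == "__main__":
    if enough_gpus_for_training():
        print("GPU selection looks sufficient for training")
    else:
        print("fewer than 2 GPUs selected; e.g. set CUDA_VISIBLE_DEVICES=0,1")
```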

Citation

@InProceedings{Gui_2020_CVPR,
author = {Gui, Shurui and Wang, Chaoyue and Chen, Qihua and Tao, Dacheng},
title = {FeatureFlow: Robust Video Interpolation via Structure-to-Texture Generation},
booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

Contact

Shurui Gui; Chaoyue Wang

License

See MIT License
