TecoGAN-PyTorch

Introduction

This is a PyTorch reimplementation of TecoGAN: Temporally Coherent GAN for Video Super-Resolution (VSR). Please refer to the official TensorFlow implementation TecoGAN-TensorFlow for more information.

Updates

  • Upgraded the codebase; it now supports multi-GPU training & testing.

Features

  • Better Performance: This repo provides a model that is smaller in size yet performs better than the official one. See our Benchmark.
  • Multiple Degradations: This repo supports two types of degradation, i.e., BI & BD. Please refer to this wiki for more details about degradation types.
  • Unified Framework: This repo provides a unified framework for distortion-based and perception-based VSR methods.

Contents

  1. Dependencies
  2. Testing
  3. Training
  4. Benchmark
  5. License & Citation
  6. Acknowledgements

Dependencies

  • Ubuntu >= 16.04
  • NVIDIA GPU + CUDA
  • Python >= 3.7
  • PyTorch >= 1.4.0
  • Python packages: numpy, matplotlib, opencv-python, pyyaml, lmdb
  • (Optional) Matlab >= R2016b

Testing

Note: We apply different models according to the degradation type. The following steps perform 4x upsampling under BD degradation; to switch to BI degradation, replace every occurrence of BD below with BI.
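
The two degradation types differ only in how the LR frames are produced from the GT frames. A minimal sketch of the difference (the blur sigma is illustrative; the repo's actual kernel may differ):

import cv2
import numpy as np

def degrade_bd(gt, scale=4, sigma=1.5):
    # BD: Gaussian blur, then keep every `scale`-th pixel.
    # sigma=1.5 is an illustrative value, not necessarily the repo's.
    blurred = cv2.GaussianBlur(gt, (0, 0), sigmaX=sigma)
    return blurred[::scale, ::scale]

def degrade_bi(gt, scale=4):
    # BI: direct bicubic downsampling.
    h, w = gt.shape[:2]
    return cv2.resize(gt, (w // scale, h // scale), interpolation=cv2.INTER_CUBIC)

gt = np.random.randint(0, 256, (128, 160, 3), dtype=np.uint8)  # dummy GT frame
print(degrade_bd(gt).shape, degrade_bi(gt).shape)  # both (32, 40, 3)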

  1. Download the official Vid4 and ToS3 datasets.
bash ./scripts/download/download_datasets.sh BD 

Alternatively, you can manually download these datasets from Google Drive and unzip them under ./data.

The dataset structure is shown below.

data
  ├─ Vid4
    ├─ GT                # Ground-Truth (GT) video sequences
      └─ calendar
        ├─ 0001.png
        └─ ...
    ├─ Gaussian4xLR      # Low Resolution (LR) video sequences in BD degradation
      └─ calendar
        ├─ 0001.png
        └─ ...
    └─ Bicubic4xLR       # Low Resolution (LR) video sequences in BI degradation
      └─ calendar
        ├─ 0001.png
        └─ ...
  └─ ToS3
    ├─ GT
    ├─ Gaussian4xLR
    └─ Bicubic4xLR
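
To sanity-check the layout after unzipping, a quick sketch that counts the PNG frames per sequence (directory names as above):

import os

for dataset in ('Vid4', 'ToS3'):
    for subset in ('GT', 'Gaussian4xLR', 'Bicubic4xLR'):
        path = os.path.join('./data', dataset, subset)
        for seq in sorted(os.listdir(path)):
            frames = [f for f in os.listdir(os.path.join(path, seq))
                      if f.endswith('.png')]
            print(dataset, subset, seq, len(frames), 'frames')
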
  2. Download our pre-trained TecoGAN model.
bash ./scripts/download/download_models.sh BD TecoGAN

You can download the model from [BD degradation] or [BI degradation], and put it under ./pretrained_models.

  3. Upsample the LR videos with TecoGAN. The results will be saved to ./results. You can specify which model to use and how many GPUs to run on in test.sh.
bash ./test.sh BD TecoGAN
  4. Evaluate the upsampled results using the official metrics. This code is borrowed from TecoGAN-TensorFlow, with minor modifications to support the BI degradation.
python ./codes/official_metrics/evaluate.py -m TecoGAN_BD_iter500000
  5. Profile the model (FLOPs, parameters and speed). You can modify the last argument to specify the size of the LR video, as sketched after this list.
bash ./profile.sh BD TecoGAN 3x134x320
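
The size string follows CxHxW; 3x134x320 is a 3-channel 134×320 LR frame. A rough sketch of turning such a spec into a dummy input and timing a forward pass (illustrative only; profile.sh does the real measurement, and the Conv2d below merely stands in for the generator):

import time
import torch

c, h, w = map(int, '3x134x320'.split('x'))  # CxHxW of one LR frame
lr = torch.randn(1, c, h, w)                # dummy input, batch size 1

model = torch.nn.Conv2d(c, c, 3, padding=1)  # placeholder for the generator
with torch.no_grad():
    start = time.time()
    for _ in range(10):
        model(lr)
print('avg forward time: %.4f s' % ((time.time() - start) / 10))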

Training

  1. Download the official training dataset following the instructions in TecoGAN-TensorFlow, rename it to VimeoTecoGAN, and place it under ./data.

  2. Generate an LMDB for the GT data to accelerate IO. The LR counterpart will be generated on the fly during training.

python ./scripts/create_lmdb.py --dataset VimeoTecoGAN --data_type GT

The following shows the dataset structure after finishing the above two steps.

data
  ├─ VimeoTecoGAN          # Original (raw) dataset
    ├─ scene_2000
      ├─ col_high_0000.png
      ├─ col_high_0001.png
      └─ ...
    ├─ scene_2001
      ├─ col_high_0000.png
      ├─ col_high_0001.png
      └─ ...
    └─ ...
  └─ VimeoTecoGAN.lmdb     # LMDB dataset
    ├─ data.mdb
    ├─ lock.mdb
    └─ meta_info.pkl       # each key has format: [vid]_[total_frame]x[h]x[w]_[i-th_frame]
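
Given the key format above, frames can be read back from the LMDB directly. A sketch assuming each value stores the raw uint8 HWC image bytes (an assumption about this repo's LMDB layout; the key below is hypothetical, and the real keys and per-sequence dimensions are recorded in meta_info.pkl):

import lmdb
import numpy as np

env = lmdb.open('./data/VimeoTecoGAN.lmdb', readonly=True, lock=False)
key = 'scene_2000_120x576x768_0000'  # hypothetical [vid]_[total_frame]x[h]x[w]_[i-th_frame]
with env.begin() as txn:
    buf = txn.get(key.encode('ascii'))
frame = np.frombuffer(buf, dtype=np.uint8).reshape(576, 768, 3)  # HWC uint8
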
  3. (Optional; required only for BI degradation) Manually generate the LR sequences with Matlab's imresize function, and then create an LMDB for them.
# Generate the raw LR video sequences. Results will be saved at ./data/Bicubic4xLR
matlab -nodesktop -nosplash -r "cd ./scripts; generate_lr_BI"

# Create LMDB for the raw LR video sequences
python ./scripts/create_lmdb.py --dataset VimeoTecoGAN --data_type Bicubic4xLR
  4. Train a FRVSR model first. FRVSR has the same generator as TecoGAN, but is trained without the perceptual objectives (i.e., the GAN and perceptual losses). When training completes, copy and rename the last checkpoint from ./experiments_BD/FRVSR/001/train/ckpt/G_iter400000.pth to ./pretrained_models/FRVSR_BD_iter400000.pth. A pre-trained FRVSR model offers a better initialization for the subsequent TecoGAN training (see the sketch after this step).
bash ./train.sh BD FRVSR

Alternatively, you can download our pre-trained FRVSR model ([BD degradation] [BI degradation]) instead of training one from scratch.

bash ./scripts/download/download_models.sh BD FRVSR
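
The copy/rename in this step can also be scripted, and the initialization itself is a plain state_dict load, since FRVSR and TecoGAN share the generator architecture. A minimal sketch (the Conv2d is a placeholder for the real generator class in this repo, and the checkpoint may wrap its weights differently; the training scripts handle this automatically):

import shutil
import torch

src = './experiments_BD/FRVSR/001/train/ckpt/G_iter400000.pth'
dst = './pretrained_models/FRVSR_BD_iter400000.pth'
shutil.copy(src, dst)

ckpt = torch.load(dst, map_location='cpu')       # may be the state_dict itself,
                                                 # or a dict that wraps it
generator = torch.nn.Conv2d(3, 3, 3, padding=1)  # placeholder for the generator
generator.load_state_dict(ckpt, strict=False)    # strict=False: sketch only
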
  5. Train a TecoGAN model. You can specify which GPUs to use in train.sh. By default, training runs in the background and its output is logged to ./experiments_BD/TecoGAN/001/train/train.log.
bash ./train.sh BD TecoGAN
  6. To monitor the training process and visualize the validation performance, run the following script.
python ./scripts/monitor_training.py -m TecoGAN -d Vid4

Note that the validation results are NOT exactly the same as the testing results above, due to differing implementations of the metrics. The differences stem from the cropping policy, the LPIPS version, and other minor issues.
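
As an illustration of why the cropping policy matters, PSNR computed with and without a border crop on the same pair of frames gives different values (the 8-pixel crop here is an assumption, not necessarily what the official metrics use):

import numpy as np

def psnr(a, b):
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse)

gt = np.random.randint(0, 256, (134, 320, 3), dtype=np.uint8)  # dummy GT frame
noise = np.random.randint(-8, 9, gt.shape)                     # dummy SR error
sr = np.clip(gt.astype(int) + noise, 0, 255).astype(np.uint8)

c = 8  # border crop in pixels
print('full-frame PSNR:', psnr(gt, sr))
print('cropped PSNR   :', psnr(gt[c:-c, c:-c], sr[c:-c, c:-c]))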

Benchmark

[1] FLOPs & speed are computed on an RGB sequence with resolution 134×320, on a single NVIDIA 1080Ti GPU.
[2] Both FRVSR & TecoGAN use 10 residual blocks, while TecoGAN+ has 16 residual blocks.

License & Citation

If you use this code for your research, please cite the following paper and our project.

@article{tecogan2020,
  title={Learning temporal coherence via self-supervision for GAN-based video generation},
  author={Chu, Mengyu and Xie, You and Mayer, Jonas and Leal-Taix{\'e}, Laura and Thuerey, Nils},
  journal={ACM Transactions on Graphics (TOG)},
  volume={39},
  number={4},
  pages={75--1},
  year={2020},
  publisher={ACM New York, NY, USA}
}
@misc{tecogan_pytorch,
  author={Deng, Jianing and Zhuo, Cheng},
  title={PyTorch Implementation of Temporally Coherent GAN (TecoGAN) for Video Super-Resolution},
  howpublished="\url{https://github.com/skycrapers/TecoGAN-PyTorch}",
  year={2020},
}

Acknowledgements

This code is built on TecoGAN-TensorFlow, BasicSR and LPIPS. We thank the authors for sharing their code.

If you have any questions, feel free to email me at jn.deng@foxmail.com.