Hierarchical B-Frame Video Coding Using Two-Layer CANF without Motion Coding (TLZMC)

This repository contains the code for a novel hierarchical B-frame coding architecture for video compression. The architecture is based on two-layer Conditional Augmented Normalization Flows (CANF) and performs no motion coding. Unlike traditional compression systems, our approach does not transmit any motion information, which explores a new direction for learned video coding. Motion coding is replaced by a low-resolution learning-based compressor and merging operations.

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023 [Paper] [Supplementary Material]

More details: https://nycu-clab.github.io

  • Evaluation
    • Codes
    • Checkpoints
    • Requirements

How to run evaluation

  1. Install torchac by running python setup.py build and then python setup.py install

  2. Install modules from requirements.txt

  3. Run python evaluate.py dataset_dir model_name checkpoint_name --group_gop n --gop m

  4. If the torch_compression module is still unavailable after installing torchac, place the files here inside the evaluation folder.

Example:

  • HEVC-B: python evaluate.py ./dataset/class_b tlzmc-plus ./tlzmc-plus-mse-2048.ckpt --group_gop 3 --gop 32

  • UVG: python evaluate.py ./dataset/uvg tlzmc-plus ./tlzmc-plus-mse-2048.ckpt --group_gop 18 --gop 32

  • Evaluation results are stored in the ./evaluation folder
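
The two example invocations differ only in the dataset directory and the --group_gop value. As a minimal sketch (the helper build_eval_command and the RUNS table are hypothetical and not part of this repository), the commands can be assembled in Python from the positional arguments and flags shown above:

```python
import shlex


def build_eval_command(dataset_dir, model_name, checkpoint, group_gop, gop):
    """Return the evaluate.py invocation as an argument list.

    Hypothetical helper: it only mirrors the command-line form shown in
    this README and does not import anything from the repository.
    """
    return [
        "python", "evaluate.py",
        dataset_dir, model_name, checkpoint,
        "--group_gop", str(group_gop),
        "--gop", str(gop),
    ]


# Dataset directory and --group_gop value, copied from the examples above.
RUNS = {
    "HEVC-B": ("./dataset/class_b", 3),
    "UVG": ("./dataset/uvg", 18),
}

commands = {
    name: build_eval_command(
        path, "tlzmc-plus", "./tlzmc-plus-mse-2048.ckpt", group_gop, 32
    )
    for name, (path, group_gop) in RUNS.items()
}

# Print the commands instead of running them, so the sketch has no side effects.
for name, cmd in commands.items():
    print(f"{name}: {shlex.join(cmd)}")
```

To actually launch a run, each argument list could be passed to subprocess.run(cmd, check=True) from the folder that contains evaluate.py.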

There are three example files that demonstrate the two-layer system:

  • TLZMC+ (DS2 (MaxPool2D), SR-CARN, Frame Synthesis) model_name: tlzmc-plus
  • TLZMC** (DS2 (MaxPool2D), SR-CARN, Multi-Frame Merging Network) model_name: tlzmc-double-star
  • TLZMC* (DS4 (MaxPool2D), SR-Net, Multi-Frame Merging Network) model_name: tlzmc-star
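
For scripting, the three model_name values and their components can be kept in a small lookup table. This is only an illustrative sketch: MODEL_VARIANTS and describe_variant are hypothetical names, and the entries are copied from the list above rather than read from the repository code.

```python
# Components of each variant, as listed in this README (not read from the code).
MODEL_VARIANTS = {
    "tlzmc-plus": {
        "downsampling": "DS2 (MaxPool2D)",
        "super_resolution": "SR-CARN",
        "merging": "Frame Synthesis",
    },
    "tlzmc-double-star": {
        "downsampling": "DS2 (MaxPool2D)",
        "super_resolution": "SR-CARN",
        "merging": "Multi-Frame Merging Network",
    },
    "tlzmc-star": {
        "downsampling": "DS4 (MaxPool2D)",
        "super_resolution": "SR-Net",
        "merging": "Multi-Frame Merging Network",
    },
}


def describe_variant(model_name):
    """Return a one-line summary of a variant, or raise ValueError if unknown."""
    try:
        variant = MODEL_VARIANTS[model_name]
    except KeyError:
        known = ", ".join(sorted(MODEL_VARIANTS))
        raise ValueError(
            f"unknown model_name {model_name!r}; expected one of: {known}"
        ) from None
    return ", ".join(
        [variant["downsampling"], variant["super_resolution"], variant["merging"]]
    )


print(describe_variant("tlzmc-plus"))
```

Checking model_name against such a table before launching an evaluation catches typos early instead of part-way through a long run.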

Checkpoints

TLZMC+ (DS2 (MaxPool2D), SR-CARN, Frame Synthesis, FTA)

TLZMC** (DS2 (MaxPool2D), SR-CARN, Multi-Frame Merging Network, FTA)

TLZMC* (DS4, SR-Net, Multi-Frame Merging Network, FTA)

Notes:

  • The checkpoints are not final and may be fine-tuned further.
  • The CANF networks have been updated (smaller model size and lower computational complexity with comparable performance).

Evaluation Dataset

HEVC-B

Download

UVG (Beauty, Bosphorus, HoneyBee, Jockey, ReadySetGo, ShakeNDry, YachtRide)

Download

Citation

@InProceedings{Alexandre_2023_CVPR,
    author    = {Alexandre, David and Hang, Hsueh-Ming and Peng, Wen-Hsiao},
    title     = {Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {10249-10258}
}