Hierarchical B-Frame Video Coding Using Two-Layer CANF without Motion Coding (TLZMC)
This repository contains the code for a novel hierarchical B-frame coding architecture for video compression, built on two-layer Conditional Augmented Normalization Flows (CANF) and requiring no motion coding. Unlike traditional compression systems, our approach does not transmit any motion information, which explores a new direction for learned video coding. Motion coding is replaced by a low-resolution learning-based compressor and merging operations.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023 [Paper] [Supplementary Material]
More detail: https://nycu-clab.github.io
- Evaluation
- Codes
- Checkpoints
- Requirements
How to run evaluation
- Install torchac using python setup.py build and python setup.py install
- Install the modules listed in requirements.txt
- Run python evaluate.py dataset_dir model_name checkpoint_name --group_gop n --gop m
- If the torch_compression module is unavailable after installing torchac, drop the files here inside the evaluation folder.
Examples:
- HEVC-B: python evaluate.py ./dataset/class_b tlzmc-plus ./tlzmc-plus-mse-2048.ckpt --group_gop 3 --gop 32
- UVG: python evaluate.py ./dataset/uvg tlzmc-plus ./tlzmc-plus-mse-2048.ckpt --group_gop 18 --gop 32
- Evaluation results are stored in the folder ./evaluation
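The evaluation commands above can also be assembled from a script when several datasets need to be run in sequence. The following is a minimal sketch, not part of the repository: the helper name build_eval_command and the RUNS list are hypothetical, and only the argument order and the two flags (--group_gop, --gop) are taken from the usage line in this README. It prints the commands rather than executing them.

```python
import shlex
import sys


def build_eval_command(dataset_dir, model_name, checkpoint, group_gop, gop,
                       script="evaluate.py"):
    """Return the argv list for one evaluate.py run.

    Argument order follows the README usage line:
    evaluate.py dataset_dir model_name checkpoint_name --group_gop n --gop m
    """
    return [
        sys.executable, str(script),
        str(dataset_dir), model_name, str(checkpoint),
        "--group_gop", str(group_gop),
        "--gop", str(gop),
    ]


# The two example runs listed in this README (paths are the README's examples).
RUNS = [
    ("./dataset/class_b", "tlzmc-plus", "./tlzmc-plus-mse-2048.ckpt", 3, 32),
    ("./dataset/uvg", "tlzmc-plus", "./tlzmc-plus-mse-2048.ckpt", 18, 32),
]

if __name__ == "__main__":
    for run in RUNS:
        cmd = build_eval_command(*run)
        # Print only; pass cmd to subprocess.run(cmd, check=True) to execute it.
        print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems if a dataset or checkpoint path contains spaces.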
Three example models demonstrate the two-layer system:
- TLZMC+ (DS2 (MaxPool2D), SR-CARN, Frame Synthesis)
model_name: tlzmc-plus
- TLZMC** (DS2 (MaxPool2D), SR-CARN, Multi-Frame Merging Network)
model_name: tlzmc-double-star
- TLZMC* (DS4 (MaxPool2D), SR-Net, Multi-Frame Merging Network)
model_name: tlzmc-star
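For scripting, the three variants listed above can be kept in a small lookup table keyed by model_name. This is an illustrative sketch only: the dictionary MODEL_VARIANTS, the function describe_model, and the field names (downsampling, super_resolution, merging) are assumptions made for this example and do not come from the repository; the values are copied from the list above.

```python
# Hypothetical lookup table; keys are the model_name values from this README.
MODEL_VARIANTS = {
    "tlzmc-plus": {
        "label": "TLZMC+",
        "downsampling": "DS2 (MaxPool2D)",
        "super_resolution": "SR-CARN",
        "merging": "Frame Synthesis",
    },
    "tlzmc-double-star": {
        "label": "TLZMC**",
        "downsampling": "DS2 (MaxPool2D)",
        "super_resolution": "SR-CARN",
        "merging": "Multi-Frame Merging Network",
    },
    "tlzmc-star": {
        "label": "TLZMC*",
        "downsampling": "DS4 (MaxPool2D)",
        "super_resolution": "SR-Net",
        "merging": "Multi-Frame Merging Network",
    },
}


def describe_model(model_name):
    """Return a one-line description, or raise ValueError for unknown names."""
    try:
        variant = MODEL_VARIANTS[model_name]
    except KeyError:
        known = ", ".join(sorted(MODEL_VARIANTS))
        raise ValueError(
            f"unknown model_name {model_name!r}; expected one of: {known}"
        ) from None
    return (f"{variant['label']}: {variant['downsampling']}, "
            f"{variant['super_resolution']}, {variant['merging']}")


if __name__ == "__main__":
    for name in MODEL_VARIANTS:
        print(name, "->", describe_model(name))
```

Checking the name before launching an evaluation gives an early, readable error for a misspelled model_name.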
Checkpoints
TLZMC+ (DS2 (MaxPool2D), SR-CARN, Frame Synthesis, FTA)
TLZMC** (DS2 (MaxPool2D), SR-CARN, Multi-Frame Merging Network, FTA)
TLZMC* (DS4, SR-Net, Multi-Frame Merging Network, FTA)
Notes:
- The checkpoints are not final and may be fine-tuned further.
- The CANF network has been updated (smaller model size and lower computational complexity with comparable performance).
Evaluation Dataset
HEVC-B
UVG (Beauty, Bosphorus, HoneyBee, Jockey, ReadySetGo, ShakeNDry, YachtRide)
Citation
@InProceedings{Alexandre_2023_CVPR,
author = {Alexandre, David and Hang, Hsueh-Ming and Peng, Wen-Hsiao},
title = {Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023},
pages = {10249-10258}
}