/clic2020-devkit

Development Kit for the CLIC compression challenge.

Primary language: Python

Updated Version

❗️ Checkout the clic2021-devkit repo for the 2021 challenge ❗️

clic2020-devkit [Deprecated]

The following describes how to work with the P-frame challenge.

Downloading data

To download all files, run:

bash download.sh path/to/data

It will create a folder path/to/data and extract all frames there, into a structure like:

video1/
    video1_frame1_y.png
    video1_frame1_u.png
    video1_frame1_v.png
    video1_frame2_y.png
    video1_frame2_u.png
    video1_frame2_v.png
    ...
video2/
    video2_frame1_y.png
    video2_frame1_u.png
    video2_frame1_v.png
    ...

For this, one of gsutil, wget, or curl must be available. gsutil is probably the most efficient way.

To download only some videos, use --max_videos: bash download.sh path/to/data --max_videos 10

NOTE: The script first downloads all videos as .zip files, resulting in 250GB+ of data. The zips are then decompressed one by one and deleted. If you interrupt the script during unpacking and later re-run it, it will re-download the zips that were already unpacked and deleted. To prevent this, at the expense of more hard-drive space, keep the zip files by passing --no_delete_zip.
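With the data on disk, consecutive frame pairs can be assembled from the filenames alone. A minimal sketch, assuming the naming scheme shown above (`frame_pairs` is a hypothetical helper, not part of the devkit):

```python
import re
from collections import defaultdict

def frame_pairs(filenames):
    # Group the *_y/_u/_v PNGs of one video by frame index, then pair
    # consecutive frames as ((Y, U, V), (Y, U, V)) filename tuples.
    # Sorting numerically avoids the lexicographic frame10 < frame2 trap.
    frames = defaultdict(dict)
    for name in filenames:
        m = re.match(r"(.+)_frame(\d+)_([yuv])\.png$", name)
        if m:
            frames[int(m.group(2))][m.group(3)] = name
    idxs = sorted(frames)
    return [
        (tuple(frames[i][c] for c in "yuv"), tuple(frames[j][c] for c in "yuv"))
        for i, j in zip(idxs, idxs[1:])
    ]
```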

P-Frame Baseline

We implement a simple non-learned baseline in baseline_np.py. The algorithm can be described as follows:

ENCODE:
  Inputs: frame 1 (F1) and frame 2 (F2).
  Encode F2 given F1:
  1. Calculate Residual_Normalized = (F2 - F1) // 2 + 127
     (this is now in range {0, ..., 255}.
      The idea of the //2 is to have 256 possible values, because
      otherwise we would have 511 values.)
  2. Compress Residual_Normalized with JPG
  -> Save to Bitstream
DECODE:
  Inputs: F1 and Bitstream
  1. Get Residual_Normalized from JPG in Bitstream
  2. F2' = F1 + ( Residual_Normalized - 127 ) * 2
     (F2' is the reconstruction)
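The normalization arithmetic above can be sketched in NumPy. This shows only the residual round trip; the actual baseline in baseline_np.py additionally runs the residual through JPG, which adds further loss. Function names are illustrative:

```python
import numpy as np

def encode_residual(f1, f2):
    # Residual normalized into [0, 255]; the //2 folds the 511 possible
    # differences into 256 values (this halving is lossy by up to 1).
    res = (f2.astype(np.int16) - f1.astype(np.int16)) // 2 + 127
    return np.clip(res, 0, 255).astype(np.uint8)

def decode_residual(f1, res):
    # Invert the normalization: F2' = F1 + (res - 127) * 2.
    rec = f1.astype(np.int16) + (res.astype(np.int16) - 127) * 2
    return np.clip(rec, 0, 255).astype(np.uint8)
```

Even without JPG in the loop, the //2 makes the round trip lossy: the reconstruction F2' is within ±1 of F2.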

The run_baseline.sh script describes how this would be used to create a submission to the challenge server. Running it produces a decoder.zip and valout.zip which could be uploaded to the P-frame (validation) track at challenge.compression.cc. The reason run_baseline.sh compresses all decoder and data files using zip is to allow efficient uploads (some browsers hang if you try to upload 160000 files).

P-Frame Dataloading Helpers

We have data loaders for PyTorch and TensorFlow. By default, they yield pairs of frames, where each frame is represented as a tuple (Y, U, V). The dimensions of U and V are half those of Y (420 format):

    [ ((Y1, U1, V1), (Y2, U2, V2))  # pair 1
      ((Y2, U2, V2), (Y3, U3, V3))  # pair 2
      ... ]

To get a single YUV tensor, we also provide a way to load merged YUV tensors (444 format):

    [ (YUV1, YUV2),
      (YUV2, YUV3),
      ...]
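Conceptually, merging upsamples the chroma planes to Y's resolution and stacks the three planes. A minimal NumPy sketch (nearest-neighbour upsampling is an assumption here; the devkit's merge_channels option may use a different scheme):

```python
import numpy as np

def merge_channels(y, u, v):
    # U and V are (H/2, W/2); repeat each chroma sample 2x2
    # (nearest-neighbour upsampling) so all planes are (H, W),
    # then stack into a single (H, W, 3) YUV tensor (444 layout).
    u_up = u.repeat(2, axis=0).repeat(2, axis=1)
    v_up = v.repeat(2, axis=0).repeat(2, axis=1)
    return np.stack([y, u_up, v_up], axis=-1)
```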

PyTorch Example

import pframe_dataset_torch as ds_torch

ds_420 = ds_torch.FrameSequenceDataset(data_root='data')
ds_444 = ds_torch.FrameSequenceDataset(data_root='data', merge_channels=True)

TensorFlow Example

Code tested in eager and graph mode with TensorFlow 1.15. TODO: test in TensorFlow 2.0.

import pframe_dataset_tf as ds_tf

ds_420 = ds_tf.frame_sequence_dataset(data_root='data')
ds_444 = ds_tf.frame_sequence_dataset(data_root='data', merge_channels=True)