This contribution researches and attempts to solve the problem of verifying that transcoded content is a reasonable match for the original source, given a good-faith effort at transcoding.
The mission is to develop a verification classifier that gives a pass / fail output (together with a confidence score) for a given segment of a given asset's rendition.
A series of articles on the topic can be found here and here.
This folder contains a Dockerfile that enables interaction with a CLI for computing the Euclidean distance values of an asset's renditions. Further insight into how this works can be gained by interacting with the feature_engineering section and by reading the aforementioned publications. Full documentation on the CLI can be found in the cli folder of this repo, here.
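As a minimal sketch of the workflow (the image tag and the mounted data path below are illustrative assumptions, not values defined by the repo):

```bash
# Build the image from this folder's Dockerfile (the tag is illustrative).
docker build -t verifier-cli .

# Run the container, mounting a local folder that holds the original asset
# and its renditions, so the CLI can compute their Euclidean distances.
docker run -it --rm -v "$(pwd)/data":/data verifier-cli
```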
This repo contains several folders that separate the different steps of the data generation and analysis.
We are using 10-second chunks of videos from the YouTube-8M Dataset, available here. Previous work with this dataset can be found here.
All the information and the scripts to create the assets reside inside the YT8M_downloader folder and are explained in this document.
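For illustration, cutting a 10-second chunk from a downloaded video with ffmpeg looks roughly like the sketch below; the file names are placeholders, and the actual logic lives in the YT8M_downloader scripts.

```bash
# Cut a 10-second chunk from the start of a downloaded video.
# Stream copy (-c copy) avoids re-encoding the chunk.
ffmpeg -ss 0 -i downloaded_video.mp4 -t 10 -c copy chunk_10s.mp4
```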
From the raw video dataset we obtain different features through analysis with different tools.
As part of the feature extraction, we generate different variations of the videos, including different renditions, flipped videos, etc. Some of these variations constitute the bulk of what we label as "attacks"; others constitute "good" renditions where no distortions are introduced.
To obtain the different "attacks", we provide several scripts, one per variation.
All the information and the scripts can be found inside the scripts folder here.
Section 1 of the Tools.ipynb notebook covers their usage, in case a notebook is preferred as a means of interaction.
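As an illustration of the kind of variations involved, the sketch below shows two hedged ffmpeg one-liners: a horizontally flipped "attack" and a plain downscale that would count as a "good" rendition. The exact commands used by the scripts in this repo may differ.

```bash
# Horizontally flipped "attack" rendition of a source segment.
ffmpeg -i source.mp4 -vf hflip flip_attack.mp4

# "Good" 720p rendition: a plain downscale with no intentional distortion.
# scale=-2:720 preserves the aspect ratio while keeping the width even.
ffmpeg -i source.mp4 -vf scale=-2:720 rendition_720p.mp4
```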
There are different standard metrics (VMAF, MS-SSIM, SSIM and PSNR), provided by external tools (ffmpeg and libav), which can be computed from the Tools.ipynb notebook in the data-analysis/notebooks folder. The notebook provides information on how to use them; the underlying scripts are also available inside the scripts folder here.
Section 2 of the Tools.ipynb notebook covers their usage, in case a notebook is preferred as a means of interaction.
Alternatively, the scripts can be run individually as bash scripts.
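For reference, ffmpeg exposes filters for several of these metrics directly; a minimal sketch (VMAF requires an ffmpeg build with libvmaf, and the file names are placeholders):

```bash
# PSNR and SSIM between a rendition (first input) and the original (second input).
ffmpeg -i rendition.mp4 -i original.mp4 -lavfi psnr -f null -
ffmpeg -i rendition.mp4 -i original.mp4 -lavfi ssim -f null -

# VMAF, available when ffmpeg is built with libvmaf.
ffmpeg -i rendition.mp4 -i original.mp4 -lavfi libvmaf -f null -
```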
At this step we should have the required data in the form of video assets and attacks, as well as the metrics extracted with the external tools, which may be required by some of the notebooks.
Further information about these notebooks can be found here.
Once models are trained and available, a CLI and a RESTful API are made available to interact with them and obtain predictions. The bash scripts launch_cli.sh and launch_api.sh can be run from the root folder of the project.
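For example, from the root of the project (the port, route, and request body in the curl call are assumptions for illustration; check launch_api.sh and the cli documentation for the actual values):

```bash
# Launch the interactive CLI.
bash launch_cli.sh

# Launch the RESTful API, then request a prediction for a rendition.
bash launch_api.sh
curl -X POST http://localhost:5000/verify \
     -H "Content-Type: application/json" \
     -d '{"source": "source.mp4", "renditions": ["rendition.mp4"]}'
```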
Several utility scripts are hosted in this folder for convenience. They are needed at different stages of the process and for different Docker instances.