Video Preprocessing

This repository provides tools for preprocessing videos for the TaiChi, VoxCeleb and UvaNemo datasets used in the paper.

Downloading videos and cropping according to precomputed bounding boxes

Install requirements:

pip install -r requirements.txt

Load youtube-dl:

wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl
chmod a+rx youtube-dl

Run the script to download the videos. Videos can be stored in two formats: .mp4, or a folder of .png images. The .png images occupy significantly more space, but the format is lossless and has better I/O performance during training.

Taichi

python load_videos.py --metadata taichi-metadata.csv --format .mp4 --out_folder taichi --workers 8

Select the number of workers based on the number of CPUs available. Note that the .png format takes approximately 80GB.
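One way to pick the worker count is to read the CPU count at run time instead of hard-coding it. The sketch below is only an illustration: the helper names `default_workers` and `load_videos_command` and the choice to leave one CPU free are assumptions, not part of this repository. The command-line flags are the ones shown above.

```python
import os


def default_workers(reserve=1):
    """Return the CPU count minus a small reserve, never below 1.

    Keeping one CPU free is an assumption made for this sketch.
    """
    cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, cpus - reserve)


def load_videos_command(metadata, out_folder, fmt=".mp4", workers=None):
    """Build the load_videos.py command line as a list of arguments."""
    if workers is None:
        workers = default_workers()
    return [
        "python", "load_videos.py",
        "--metadata", metadata,
        "--format", fmt,
        "--out_folder", out_folder,
        "--workers", str(workers),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.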

VoxCeleb

python load_videos.py --metadata vox-metadata.csv --format .mp4 --out_folder vox --workers 8

Note that the .png format takes approximately 300GB.

UvaNemo

Since the videos are not available on YouTube, you have to download them from the official website and then run:

python load_videos.py --metadata nemo-metadata.csv --format .mp4 --out_folder nemo --workers 8 --video_folder path/to/original/videos

Note that the .png format takes approximately 18GB.
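Because the .png sizes differ so much between datasets, it can help to check free disk space before choosing a format. The following is a minimal sketch using the approximate sizes quoted above. The helper names, the 10% safety margin, and the treatment of 1GB as 1024**3 bytes are assumptions made for illustration.

```python
import shutil

# Approximate .png sizes in GB, taken from the notes in this README.
PNG_SIZE_GB = {"taichi": 80, "vox": 300, "nemo": 18}


def fits_as_png(dataset, free_bytes, margin=1.1):
    """Return True if free_bytes covers the .png size plus a safety margin."""
    required = PNG_SIZE_GB[dataset] * 1024**3 * margin
    return free_bytes >= required


def choose_format(dataset, out_dir="."):
    """Suggest .png when the disk holding out_dir has room, else .mp4."""
    free = shutil.disk_usage(out_dir).free
    return ".png" if fits_as_png(dataset, free) else ".mp4"
```

The result of `choose_format("vox")` can then be passed as the value of `--format`.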

Preprocessing VoxCeleb dataset

If you need to change the cropping strategy for the VoxCeleb dataset or produce new bounding box annotations, follow these steps:

Load the VoxCeleb1 (VoxCeleb2) annotations:

wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox1_test_txt.zip
unzip vox1_test_txt.zip

wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox1_dev_txt.zip
unzip vox1_dev_txt.zip

wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox2_test_txt.zip
unzip vox2_test_txt.zip

wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox2_dev_txt.zip
unzip vox2_dev_txt.zip
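The four download-and-unzip steps above follow one naming pattern, so they can also be scripted. The sketch below assumes the archives are reachable over plain `http://` (the wget commands above give no scheme) and does not check whether the files are still served at that address. The function names are hypothetical.

```python
import urllib.request
import zipfile
from pathlib import Path

# Assumption: same host and path as the wget commands, with an http:// scheme.
BASE_URL = "http://www.robots.ox.ac.uk/~vgg/data/voxceleb/data/"


def annotation_archives():
    """Return the archive names for VoxCeleb1 and VoxCeleb2, test and dev."""
    return [
        f"vox{version}_{split}_txt.zip"
        for version in (1, 2)
        for split in ("test", "dev")
    ]


def download_annotations(dest="."):
    """Download each archive into dest (if missing) and extract it there."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    for name in annotation_archives():
        archive = dest / name
        if not archive.exists():
            urllib.request.urlretrieve(BASE_URL + name, str(archive))
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
```

Calling `download_annotations()` fetches and extracts all four archives into the current directory.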

Load youtube-dl:

wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl
chmod a+rx youtube-dl

Install face-alignment library:

git clone https://github.com/1adrianb/face-alignment
cd face-alignment
pip install -r requirements.txt
python setup.py install

Install ffmpeg:

sudo apt-get install ffmpeg

Run preprocessing (assuming 8 GPUs and 5 workers per GPU):

python crop_vox.py --workers 40 --device_ids 0,1,2,3,4,5,6,7 --format .mp4 --dataset_version 2
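The value of `--workers` in the command above is the number of GPUs multiplied by the workers per GPU (8 × 5 = 40). For a different machine, the arguments can be derived as in the sketch below. The helper name `crop_vox_command` is an assumption, and the sketch assumes the GPUs are numbered consecutively from 0.

```python
def crop_vox_command(num_gpus, workers_per_gpu, fmt=".mp4", dataset_version=2):
    """Build the crop_vox.py command line for a given GPU setup."""
    if num_gpus < 1 or workers_per_gpu < 1:
        raise ValueError("num_gpus and workers_per_gpu must be at least 1")
    # Assumption: device ids are 0..num_gpus-1.
    device_ids = ",".join(str(i) for i in range(num_gpus))
    total_workers = num_gpus * workers_per_gpu
    return [
        "python", "crop_vox.py",
        "--workers", str(total_workers),
        "--device_ids", device_ids,
        "--format", fmt,
        "--dataset_version", str(dataset_version),
    ]
```

With `num_gpus=8` and `workers_per_gpu=5` this reproduces the arguments shown above.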

Preprocessing TaiChi dataset

If you need to change the cropping strategy for the TaiChi dataset or produce new bounding box annotations, follow these steps:

Download videos based on annotations:

python load_videos.py --metadata taichi-metadata.csv --format .mp4 --out_folder taichi --workers 8 --video_folder youtube-taichi --no_crop

Install maskrcnn-benchmark by following the installation guide: https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/INSTALL.md

Load youtube-dl:

wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl
chmod a+rx youtube-dl

Run preprocessing (assuming 8 GPUs and 5 workers per GPU):

python crop_taichi.py --workers 40 --device_ids 0,1,2,3,4,5,6,7 --format .mp4

Preprocessing Nemo dataset

If you need to change the cropping strategy for the Nemo dataset or produce new bounding box annotations, follow these steps:

Install face-alignment library:

git clone https://github.com/1adrianb/face-alignment
cd face-alignment
pip install -r requirements.txt
python setup.py install

Download the videos from the official website and run:

python crop_nemo.py --in_folder /path/to/videos --out_folder nemo --device_ids 0,1 --workers 8 --format .mp4

Additional notes

Citation: