
Controllable Continuous Gaze Redirection


Authors: Weihao Xia, Yujiu Yang, Jing-Hao Xue, and Wensen Feng

Contact: weihaox AT outlook.com

In this work, we present a novel method that supports both precise gaze redirection and continuous interpolation. With the well-disentangled and hierarchically organized latent space, we can adjust the order and strength of each attribute by altering the additional control vector. Furthermore, we contribute a high-quality gaze dataset, which covers a large range of gaze directions, to benefit researchers in related areas.

interpGaze

This is a reproduction and may differ slightly from the original implementation. If you find any problems with the code or the processed data, please feel free to open a pull request or an issue.

Dependencies

pytorch==1.7
numpy==1.16.3
scipy==1.2.1
matplotlib==2.2.4
pandas==0.24.2
imageio==2.5.0
requests==2.21.0
tqdm==4.31.1
colored==1.3.93
opencv_python==4.1.0.25
dlib==19.17.0
Pillow==6.2.2
tensorboardX==1.6
PyYAML==5.1.1
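
These pinned versions can be installed in one step, assuming they are saved as requirements.txt (the actual file name in the repo may differ):

pip install -r requirements.txt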

Dataset

We contribute a high-quality gaze dataset, which covers a large range of gaze directions and is diverse in eye shape, glasses, age and gender, to benefit researchers in related areas. Samples are shown in the following pictures (more can be found at link).

The full dataset will be released in the future. The processed gaze patches have already been released on Google Drive.

A comparison of some popular gaze datasets is shown in the following table.

For a fair comparison, we also train our model on the same dataset as He et al., which contains eye patch images parsed from the Columbia Gaze dataset. The dataset contains six subfolders: N30P/, N15P/, 0P/, P15P/, P30P/ and all/. The prefix 'N' denotes a negative head pose and 'P' a positive one. The folder all/ contains the eye patch images of all head poses.
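
As a quick sanity check of this layout, the snippet below counts the eye patches in each subfolder. It is a minimal sketch: the dataset root and the image extension are assumptions.

```python
from pathlib import Path

# Assumed extraction root; adjust to wherever the dataset was unpacked.
root = Path("dataset")

for pose in ["N30P", "N15P", "0P", "P15P", "P30P", "all"]:
    patches = list((root / pose).glob("*.jpg"))  # extension is an assumption
    print(f"{pose}: {len(patches)} eye patches")
```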

You can directly download the dataset processed by HzDmS via this link, or use our provided scripts in the repo or on Colab, which follow the procedure described in their paper:

"We first run face alignment with dlib by parsing the face with 68 facial landmark points. After that, a minimal enclosed circle with center(x, y) and radius R was extracted from the 6 landmark points of each eye. The cropping region of the eye patch is set as a square box with center (x, y) and side length 3.4R. We flipped the right eye images horizontally to align with the left eye images. All eye patch images were resized to 64 × 64."

Baselines

He et al.

Photo-Realistic Monocular Gaze Redirection Using Generative Adversarial Networks, proposed by He et al., ICCV 2019.

To train:

python main.py --mode train --data_path ./dataset/all/ --log_dir ./log/ --batch_size 32 --vgg_path ./vgg_16.ckpt

To test:

python main.py --mode eval --data_path ./dataset/0P/ --log_dir ./log/ --batch_size 21

Then a folder named eval/ will be generated inside ./log/. Generated images, input images and target images will be stored in eval/.

Deepwarp

The official DeepWarp demo page is currently under maintenance, so we use a TensorFlow re-implementation of DeepWarp by BlueWinters.

We provide the dataset and the pretrained models of He et al. and DeepWarp on BaiduYun. Please contact us to acquire the password.

Training

Download the dataset and checkpoints, extract them to ./dataset and ./checkpoints, then run

python src/run.py train --data_dir dataset/all  -sp checkpoints/Gaze -bs 128 -gpu 0,1,2,3 --save_dir 'checkpoints/Gaze/'

Evaluation

For interpolation

CUDA_VISIBLE_DEVICES=7 python3 src/run.py test_selected_curve -mp checkpoints/Gaze -sp results/interpolation

For redirection

CUDA_VISIBLE_DEVICES=7 python3 src/run.py attribute_manipulation -mp checkpoints/Gaze -sp results/redirection  --filter_target_attr 0P -s 1 --branch_idx 0 --n_ref 1 -bs 1

NOTE: This is a reproduction (the original implementation was lost after my graduation) and may differ slightly from the results reported in the paper. Parts of the code still need a few cleaning updates; I am wrapping up sanity checks after refactoring and fixing potential bugs, and will do further verification and extensive refactoring within the next few weeks.

Experiments

This picture illustrates interpolation between two given samples (green and blue). Other attributes such as eyebrows, glasses, hair and skin color are well preserved in the redirected gaze images, which means our model works consistently well in generating person-specific gaze images. Furthermore, the encoder unfolds the natural image manifold into a flat and smooth latent space, which allows interpolation and even extrapolation, as shown in the last column.
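
For intuition, the interpolation above amounts to a linear walk between two encoded samples, with factors beyond 1 giving extrapolation. The sketch below is illustrative only: encoder and decoder are hypothetical stand-ins for the trained modules, and the paper's per-attribute control vector is reduced to a single scalar.

```python
import torch

@torch.no_grad()
def interpolate(encoder, decoder, img_a, img_b, steps=8, extrapolate=0.25):
    """Decode a linear path between the latents of two images."""
    z_a, z_b = encoder(img_a), encoder(img_b)
    # Factors in [0, 1] interpolate; factors > 1 extrapolate past img_b.
    alphas = torch.linspace(0.0, 1.0 + extrapolate, steps)
    return [decoder(z_a + a * (z_b - z_a)) for a in alphas]
```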

Citation

If you find our work, code or pre-trained models helpful for your research, please consider citing:

@inproceedings{xia2020gaze,
  title={Controllable Continuous Gaze Redirection},
  author={Xia, Weihao and Yang, Yujiu and Xue, Jing-Hao and Feng, Wensen},
  year={2020},
  booktitle={ACM MM},
}