
HIDA: Human-Inspired Facial Sketch Synthesis with Dynamic Adaptation

Abstract

Facial sketch synthesis (FSS) aims to generate a vivid sketch portrait from a given facial photo. Existing FSS methods merely rely on 2D representations of facial semantics or appearance. However, professional human artists usually use outlines or shadings to convey 3D geometry, so facial 3D geometry (e.g., a depth map) is extremely important for FSS. Besides, different artists may use diverse drawing techniques and create multiple styles of sketches, but within a single sketch the style is globally consistent. Inspired by such observations, in this paper we propose a novel *Human-Inspired Dynamic Adaptation* (HIDA) method. Specifically, we propose to dynamically modulate neuron activations based on a joint consideration of both facial 3D geometry and 2D appearance, as well as globally consistent style control. Besides, we use deformable convolutions at coarse scales to align deep features, for generating abstract and distinct outlines. Experiments show that HIDA can generate high-quality sketches in multiple styles, and significantly outperforms previous methods over a large range of challenging faces. Besides, HIDA allows precise style control of the synthesized sketch, and generalizes well to natural scenes. Our code will be released after peer review.
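To make the mechanism concrete, here is a minimal PyTorch sketch of dynamic modulation conditioned on depth, appearance, and a global style code, in the spirit of SPADE-style normalization. Everything here (`DynamicModulation`, `style_dim`, the 1-channel depth input) is an illustrative assumption, not the released HIDA implementation.

```python
# Illustrative sketch only -- NOT the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicModulation(nn.Module):
    """Modulate activations using 3D geometry (depth), 2D appearance
    (photo), and a globally shared style code."""
    def __init__(self, feat_ch, cond_ch=4, style_dim=64, hidden=128):
        super().__init__()
        self.norm = nn.InstanceNorm2d(feat_ch, affine=False)
        # Spatial branch: depth (1 ch) + photo (3 ch) -> per-pixel scale/shift.
        self.shared = nn.Sequential(
            nn.Conv2d(cond_ch, hidden, 3, padding=1), nn.ReLU(inplace=True))
        self.gamma = nn.Conv2d(hidden, feat_ch, 3, padding=1)
        self.beta = nn.Conv2d(hidden, feat_ch, 3, padding=1)
        # Global branch: one scale/shift pair shared by all pixels, so the
        # chosen style stays consistent over the whole sketch.
        self.style = nn.Linear(style_dim, 2 * feat_ch)

    def forward(self, feat, photo, depth, style_code):
        cond = torch.cat([photo, depth], dim=1)
        cond = F.interpolate(cond, size=feat.shape[2:], mode='bilinear',
                             align_corners=False)
        h = self.shared(cond)
        g_sty, b_sty = self.style(style_code).chunk(2, dim=1)
        g_sty, b_sty = g_sty[:, :, None, None], b_sty[:, :, None, None]
        return (self.norm(feat) * (1 + self.gamma(h) + g_sty)
                + self.beta(h) + b_sty)

# Shape check: 256-channel features at 32x32, conditioned on a 256x256 photo.
mod = DynamicModulation(feat_ch=256)
out = mod(torch.randn(1, 256, 32, 32), torch.randn(1, 3, 256, 256),
          torch.randn(1, 1, 256, 256), torch.randn(1, 64))
print(out.shape)  # torch.Size([1, 256, 32, 32])
```

The spatial branch lets geometry and appearance modulate activations per pixel, while the global branch applies a single scale/shift so the drawing style cannot drift across the image.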

Paper Information

Fei Gao, Yifan Zhu, Chang Jiang, and Nannan Wang, "Human-Inspired Facial Sketch Synthesis with Dynamic Adaptation," in Proceedings of the International Conference on Computer Vision (ICCV), 2023.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{gao2023human,
  title={Human-Inspired Facial Sketch Synthesis with Dynamic Adaptation},
  author={Gao, Fei and Zhu, Yifan and Jiang, Chang and Wang, Nannan},
  booktitle={Proceedings of the International Conference on Computer Vision (ICCV)},
  pages={},
  year={2023}
}

Pipeline

(Pipeline overview figure)

Sample Results

  • Comparison with SOTAs on the FS2K dataset:

(Comparison figures on three example faces from FS2K)

(a) Photo (b) Depth (c) Ours (d) Pix2PixHD (e) FSGAN (f) SCA-GAN (g) GT (h) Pix2Pix (i) MDAL (j) CycleGAN (k) GENRE

  • Performance on faces in-the-wild:

(Results on three in-the-wild faces)

(a) Photo (b) Ours (Style 1) (c) Ours (Style 2) (d) Ours (Style 3) (e) GENRE (f) Pix2Pix (g) CycleGAN (h) SCA-GAN

  • Performance of our DISC model on natural images:

(Results on a cat photo and two building photos)

(a) Photo (b) Depth (c) Ours (Style 1) (d) Ours (Style 2) (e) Ours (Style 3)

  • Extension to pen drawings and oil paintings: (example figure)

  • More Results:

We offer more results here: https://drive.google.com/file/d/1vT0nqEVVByBW1QltYVX_mIYCcZ4wXsQD/view?usp=sharing

Prerequisites

  • Linux or macOS
  • Python 3.8.12
  • Pytorch-lightning 0.7.5
  • CPU or NVIDIA GPU + CUDA CuDNN

Getting Started

Installation

  • Clone this repo:

    git clone https://github.com/AiArt-HDU/DISC
    cd DISC
    
  • Install PyTorch 1.7.1 and torchvision from http://pytorch.org and other dependencies (e.g., visdom and dominate). You can install all the dependencies by

    pip install -r requirements.txt
    
  • The DCN-V2 dependency is more involved to install; please refer to the official DCN-V2 repository for build instructions. A hedged alternative sketch using torchvision's built-in deformable convolution follows below.
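If building DCN-V2 is a blocker, torchvision ships a deformable convolution operator that can stand in for quick experiments. The sketch below only illustrates how deformable convolutions can align deep features; it is not the repo's module, and it uses plain (v1-style) offsets because modulated v2 masks require torchvision >= 0.9.

```python
# Hedged alternative sketch using torchvision's built-in operator.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformAlign(nn.Module):
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # Two offsets (x, y) per kernel sampling position.
        self.offset_pred = nn.Conv2d(channels, 2 * kernel_size * kernel_size,
                                     kernel_size, padding=pad)
        nn.init.zeros_(self.offset_pred.weight)  # start as a plain conv
        nn.init.zeros_(self.offset_pred.bias)
        self.dcn = DeformConv2d(channels, channels, kernel_size, padding=pad)

    def forward(self, x):
        offset = self.offset_pred(x)  # learned sampling offsets
        return self.dcn(x, offset)

feat = torch.randn(1, 64, 32, 32)
print(DeformAlign(64)(feat).shape)  # torch.Size([1, 64, 32, 32])
```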

Apply a pre-trained model

  • A face photo→sketch model pre-trained on the FS2K dataset
  • The pre-trained model needs to be saved at ./checkpoint
  • Then you can test the model (a quick sanity check is sketched below)
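A quick, hypothetical sanity check that the weights are in place (the checkpoint filename below is illustrative; use the actual name of the released file):

```python
# Hypothetical check -- 'hida_fs2k.pth' is a placeholder filename.
import os
import torch

ckpt_path = './checkpoint/hida_fs2k.pth'
assert os.path.isfile(ckpt_path), 'Download the pre-trained model first.'
state = torch.load(ckpt_path, map_location='cpu')
print(type(state))  # typically a state_dict or a dict of sub-networks
```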

Train/Test

  • Download the FS2K dataset here

  • Train a model

    python train.py --root your_root_path_train
    
  • Test the model: please first prepare depth maps for your test data using the 3DDFA method

    python test.py --data_dir your_data_path_test --depth_dir your_depth_path_test 
    
  • If you want to train on your own data, please first align your pictures and prepare their depth maps according to the tutorial in the Preprocessing steps section below.

Preprocessing steps

Face photos (and paired drawings) need to be aligned and paired with depth maps; our training code requires depth maps computed after alignment.

In our work, depth maps are generated by the method in [1]:

  • First, align, resize, and crop the face photos (and corresponding drawings) to 250×250 (a naive sketch of this step follows below).
  • Then, use the code in 3DDFA to generate depth maps for the face photos and drawings.
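For reference, here is a naive sketch of the 250×250 resize/crop step. Real preprocessing should align faces using facial landmarks (e.g., from 3DDFA); the center crop below is only an assumption for illustration.

```python
# Naive center-crop-and-resize sketch; real alignment uses facial landmarks.
from PIL import Image

def center_crop_resize(src_path, dst_path, size=250):
    img = Image.open(src_path).convert('RGB')
    w, h = img.size
    side = min(w, h)
    left, top = (w - side) // 2, (h - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img.resize((size, size), Image.BICUBIC).save(dst_path)

center_crop_resize('photo.jpg', 'photo_250.jpg')
```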

[1] J. Guo, X. Zhu, Y. Yang, F. Yang, Z. Lei, and S. Z. Li, “Towards fast, accurate and stable 3d dense face alignment,” in Proceedings of the European Conference on Computer Vision (ECCV), 2020.


Acknowledgments

Our code is inspired by pytorch-CycleGAN-and-pix2pix, GENRE, and CoCosNet.