This is the official repository for the following paper:
Painterly Image Harmonization in Dual Domains [arXiv]
Junyan Cao, Yan Hong, Li Niu
Accepted by AAAI 2023.
Part of our PHDNet has been integrated into our image composition toolbox libcom (https://github.com/bcmi/libcom). You are welcome to visit and try it \(^▽^)/
Painterly image harmonization aims to adjust the foreground style of the painterly composite image to make it compatible with the background. A painterly composite image contains a photographic foreground object and a painterly background image.
Our PHDNet is the first feed-forward painterly image harmonization method with released code.
When the background has dense textures or abstract style, our PHDiffusion can achieve better performance.
Sometimes setting the background style as the target style is not reasonable; this problem is addressed in our ArtoPIH.
Painterly image harmonization requires two types of images: photographic images and painterly images. We cut an object from a photographic image using the corresponding instance mask and paste it onto a painterly image to generate a composite image.
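The cut-and-paste step described above can be sketched as follows. This is a minimal illustration, not the repository's data loader: the function name `make_composite` is hypothetical, and it assumes the photo, mask, and painting have already been resized to the same resolution.

```python
import numpy as np

def make_composite(photo, mask, painting):
    """Paste the masked foreground of `photo` onto `painting`.

    photo, painting: uint8 arrays of shape (H, W, 3) with identical sizes.
    mask: array of shape (H, W); nonzero pixels belong to the foreground object.
    Assumption: all inputs were resized to a common resolution beforehand.
    """
    if photo.shape != painting.shape or photo.shape[:2] != mask.shape:
        raise ValueError("photo, painting and mask must share height and width")
    # Boolean foreground map with a trailing axis so it broadcasts over channels.
    fg = (mask > 0)[..., None]
    # Take photo pixels inside the mask and painting pixels everywhere else.
    return np.where(fg, photo, painting)
```

The result is the unharmonized composite: the pasted object still has its photographic style, which is what the harmonization network is then asked to adjust.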
We use images from COCO to produce the foreground objects. For each image, we select an object whose foreground ratio lies in [0.05, 0.3] and generate the foreground mask. The selected foreground masks are provided in Baidu Cloud (access code: ww1t) or OneDrive. The training set can be downloaded from COCO_train (alternative: Baidu Cloud (access code: nfsh), OneDrive) and the test set from COCO_test (alternative: Baidu Cloud (access code: nsvj), OneDrive).
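The foreground-ratio filter can be sketched as below. This assumes the ratio means the object's pixel area divided by the total image area, which the text above does not spell out; the function names are hypothetical and the selection script actually used for the released masks may differ.

```python
import numpy as np

def foreground_ratio(mask):
    """Fraction of pixels in `mask` that are nonzero (treated as foreground).

    Assumption: ratio = foreground pixel count / total pixel count.
    """
    mask = np.asarray(mask)
    if mask.size == 0:
        raise ValueError("mask must contain at least one pixel")
    return float(np.count_nonzero(mask)) / mask.size

def is_valid_foreground(mask, low=0.05, high=0.3):
    """True if the object's foreground ratio lies in [low, high]."""
    return low <= foreground_ratio(mask) <= high
```

Under this reading, objects that cover too little of the image (below 5%) or too much of it (above 30%) are skipped.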
We use images from WikiArt as the backgrounds. The dataset can be downloaded from Baidu Cloud (access code: sc0c) or OneDrive. The training/test data lists are provided in wikiart_split or OneDrive.
An example dataset directory layout:
your_dir
│
└───MS-COCO
│   └───SegmentationClass_select
│   │   │   XXX.png
│   │   │   ...
│   │
│   └───train2014
│   │   │   XXX.jpg
│   │   │   ...
│   │
│   └───val2014
│       │   XXX.jpg
│       │   ...
│
└───wikiart
    └───WikiArt_Split
    │   │   style_class.txt
    │   │   style_train.csv
    │   │   style_val.csv
    │
    └───unzipped_subfolders
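Before training, it can help to check that the layout above is in place. The following is a small sketch using only the standard library; the helper `missing_entries` is hypothetical and not part of this repository, and it checks only the fixed names shown in the tree (the WikiArt style subfolders are not enumerated).

```python
from pathlib import Path

# Paths taken from the example layout, relative to your_dir.
EXPECTED = [
    "MS-COCO/SegmentationClass_select",
    "MS-COCO/train2014",
    "MS-COCO/val2014",
    "wikiart/WikiArt_Split/style_class.txt",
    "wikiart/WikiArt_Split/style_train.csv",
    "wikiart/WikiArt_Split/style_val.csv",
]

def missing_entries(root):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_entries("your_dir")` returns an empty list when every expected folder and split file is present.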
- Linux
- Python 3
- PyTorch 1.10
- NVIDIA GPU + CUDA
- Clone this repo:
git clone https://github.com/bcmi/PHDNet-Painterly-Image-Harmonization.git
# cd to this repo's root dir
- Prepare the datasets.
- Install PyTorch and dependencies from http://pytorch.org.
- Install Python requirements:
pip install -r requirements.txt
- Download pre-trained VGG19 from Baidu Cloud (access code: pc9y) or OneDrive.
- Train PHDNet:
cd PHDNet/scripts
bash train_phd.sh
The trained model will be saved under ./<checkpoints_dir>/<name>/.
To load a model and continue training it, add --continue_train and set --epoch XX in train_phd.sh. This loads the model ./<checkpoints_dir>/<name>/<epoch>_net_G.pth.
For example, if the model is saved in ./AA/BB/latest_net_G.pth, then checkpoints_dir should be ../AA/, name should be BB, and epoch should be latest.
Remember to set content_dir and style_dir in train_phd.sh to the paths of the corresponding datasets.
- Test PHDNet:
cd PHDNet/scripts
bash test_phd.sh
This loads the model ./<checkpoints_dir>/<name>/<epoch>_net_G.pth and saves the visualizations under ./<checkpoints_dir>/<name>/web/TestImages/.
Our pre-trained model is available on Baidu Cloud (access code: po7q) or OneDrive.
- Note: <...> denotes a modifiable parameter.
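The checkpoint naming convention used in the training and test steps, ./<checkpoints_dir>/<name>/<epoch>_net_G.pth, can be sketched as a small helper. The function `generator_checkpoint_path` is hypothetical and only illustrates how the three parameters combine; the repository's own loading code may build the path differently.

```python
import os

def generator_checkpoint_path(checkpoints_dir, name, epoch="latest"):
    """Build <checkpoints_dir>/<name>/<epoch>_net_G.pth.

    `epoch` may be a number or the string "latest".
    """
    return os.path.join(checkpoints_dir, name, f"{epoch}_net_G.pth")
```

With the values from the example above (checkpoints_dir ../AA/, name BB, epoch latest), the helper yields ../AA/BB/latest_net_G.pth on Linux, a path relative to PHDNet/scripts, where the scripts are run.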
Our method is particularly effective on background paintings with periodic textures and patterns, because it leverages information from both the spatial domain and the frequency domain.
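As a general illustration of why the frequency domain suits periodic patterns (this is not PHDNet's architecture, only a property of the 2D Fourier transform), a periodic texture concentrates its energy in a few frequency bins. The sketch below uses a hypothetical helper, `dominant_frequency`, on a synthetic striped image.

```python
import numpy as np

def dominant_frequency(image):
    """Return (fy, fx): the strongest non-DC frequency, in cycles per image.

    image: 2D grayscale array. The mean is removed so the DC term does not win.
    """
    image = np.asarray(image, dtype=np.float64)
    spectrum = np.abs(np.fft.fft2(image - image.mean()))
    ky, kx = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    h, w = image.shape
    # A real image has mirrored peaks at k and N - k; fold them together.
    return int(min(ky, h - ky)), int(min(kx, w - kx))

# Synthetic texture: vertical stripes repeating 4 times across a 64x64 image.
x = np.arange(64)
stripes = np.tile(np.cos(2 * np.pi * 4 * x / 64), (64, 1))
print(dominant_frequency(stripes))
```

A pattern that is spread over every pixel in the spatial domain is summarized here by a single peak location, which is the kind of compact cue a frequency-domain branch can exploit.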