
Semantic Palette - Official PyTorch Implementation

Semantic Palette: Guiding Scene Generation with Class Proportions
Guillaume Le Moing, Tuan-Hung Vu, Himalaya Jain, Patrick Pérez, Matthieu Cord
In CVPR 2021.

Paper: https://arxiv.org/abs/2106.01629
Video: https://youtu.be/ejkbaJD4Emk

Abstract: Despite the recent progress of generative adversarial networks (GANs) at synthesizing photo-realistic images, producing complex urban scenes remains a challenging problem. Previous works break down scene generation into two consecutive phases: unconditional semantic layout synthesis and image synthesis conditioned on layouts. In this work, we propose to condition layout generation as well for higher semantic control: given a vector of class proportions, we generate layouts with matching composition. To this end, we introduce a conditional framework with novel architecture designs and learning objectives, which effectively accommodates class proportions to guide the scene generation process. The proposed architecture also allows partial layout editing with interesting applications. Thanks to the semantic control, we can produce layouts close to the real distribution, helping enhance the whole scene generation process. On different metrics and urban scene benchmarks, our models outperform existing baselines. Moreover, we demonstrate the merit of our approach for data augmentation: semantic segmenters trained on real layout-image pairs along with additional ones generated by our approach outperform models only trained on real pairs.
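
For intuition, the "semantic palette" that conditions generation is simply the normalized class histogram of a layout: the fraction of pixels belonging to each semantic class. A minimal sketch of how such a vector can be computed from a label map (illustrative only, not part of the repository's code):

import numpy as np

def class_proportions(label_map, num_classes):
    # fraction of pixels belonging to each semantic class, summing to 1
    counts = np.bincount(label_map.reshape(-1), minlength=num_classes)
    return counts / counts.sum()

# toy 2x3 layout with three classes
layout = np.array([[0, 0, 2],
                   [0, 1, 2]])
print(class_proportions(layout, num_classes=3))  # approx. [0.5, 0.167, 0.333]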

Installation

Clone this repository:

git clone https://github.com/valeoai/SemanticPalette.git
cd SemanticPalette/

This code requires PyTorch 1.1+ and Python 3+. To install dependencies run:

pip install -r requirements.txt

To install apex run:

cd tools
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ../..

To install inplace_abn run:

cd models/segmentor
git clone https://github.com/mapillary/inplace_abn.git
cd inplace_abn
python setup.py install
cd ../../..

Download segmenter pretrained weights:

cd checkpoints
wget http://sceneparsing.csail.mit.edu/model/pretrained_resnet/resnet101-imagenet.pth
cd ..

Prepare datasets

Cityscapes

First, prepare the corresponding directory:

mkdir datasets/cityscapes

Download the data (i.e., images leftImg8bit_trainvaltest.zip and corresponding annotations gtFine_trainvaltest.zip) from the official database and put them in this new directory, then run:

unzip datasets/cityscapes/leftImg8bit_trainvaltest.zip -d datasets/cityscapes
unzip datasets/cityscapes/gtFine_trainvaltest.zip -d datasets/cityscapes

Cityscapes-25k

Download the extra images leftImg8bit_trainextra.zip into the datasets/cityscapes folder, then run:

unzip datasets/cityscapes/leftImg8bit_trainextra.zip -d datasets/cityscapes

In the same folder, download the annotations estimated by a state-of-the-art segmentation model from Google Drive, then run:

unzip datasets/cityscapes/refinement_final_v0.zip -d datasets/cityscapes
mkdir datasets/cityscapes/extraFine
mv datasets/cityscapes/refinement_final/train_extra/ datasets/cityscapes/extraFine/
rm -r datasets/cityscapes/extraFine/train_extra/tmp
python data/scripts/prepare_cityscapes_25k.py --data_root datasets/cityscapes/extraFine

Indian Driving Dataset (IDD)

First, prepare the corresponding directory:

mkdir datasets/idd

In this folder, download the data from the official database and use the AutoNUE codebase to preprocess the labels, then run:

python data/scripts/prepare_idd.py --data_root datasets/idd

CelebA

First, prepare the corresponding directory:

mkdir datasets/celeba

Download the data from here, then run:

python data/scripts/prepare_celeba.py --data_root datasets/celeba

Training

All scripts are given for the Cityscapes dataset. Note that similar scripts can be obtained for Cityscapes-25k (by adding the --load_extra flag), as well as for IDD and CelebA (by setting the --dataset flag to idd or celeba, respectively).

1 - Palette distribution estimation

Load semantic histograms from the training set and fit a Gaussian Mixture Model (GMM) whose number of components is set according to the Akaike information criterion.

bash scripts/train_palette_estimator.sh
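
As a rough illustration of what this step amounts to (using scikit-learn here, which is an assumption; the repository ships its own estimator code), one would fit GMMs with an increasing number of components and keep the one with the lowest AIC:

import numpy as np
from sklearn.mixture import GaussianMixture

# palettes: one row per training layout, each row a vector of class proportions
# (file name is hypothetical)
palettes = np.load("palettes.npy")

best_gmm, best_aic = None, np.inf
for n_components in range(1, 11):
    gmm = GaussianMixture(n_components=n_components, covariance_type="full", random_state=0)
    gmm.fit(palettes)
    aic = gmm.aic(palettes)
    if aic < best_aic:
        best_gmm, best_aic = gmm, aic

# at test time, new palettes can be sampled from the fitted distribution
sampled_palettes, _ = best_gmm.sample(4)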

2 - Layout synthesizer training

Before running the following command lines, one should change --estimator_load_path in the scripts to point to the checkpoint folder of the trained palette estimator from step 1.

  • Palette-conditioned layout synthesis
bash scripts/train_layout_synthesizer.sh
  • Palette-conditioned partial editing of layouts
bash scripts/train_partial_layout_synthesizer.sh

The layout synthesizer is trained from low to high resolution and checkpoints are saved for each resolution level. Stop training when you have reached the desired resolution.
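
The reason class proportions can guide training is that the composition of a generated (soft) layout is differentiable: spatially averaging the per-class probability maps yields the generated palette, which can be compared to the target one. A hedged PyTorch sketch of such a palette-matching term (illustrative only; the actual learning objectives are described in the paper):

import torch

def palette_loss(soft_layout, target_palette):
    # soft_layout: (B, C, H, W) class probability maps from the layout generator
    # target_palette: (B, C) desired class proportions (each row sums to 1)
    generated_palette = soft_layout.mean(dim=(2, 3))  # spatial average -> (B, C)
    return torch.abs(generated_palette - target_palette).sum(dim=1).mean()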

3 - Image synthesizer training

Then, train the image synthesizer (GauGAN in our case):

bash scripts/train_image_synthesizer_128.sh

One can change --max_dim to train at another resolution level.
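
For reference, GauGAN's generator injects the layout through SPADE normalization: features are normalized, then modulated by per-pixel scale and bias maps predicted from the semantic map. A simplified sketch of such a layer (not the repository's implementation):

import torch
import torch.nn as nn
import torch.nn.functional as F

class SPADE(nn.Module):
    def __init__(self, feat_channels, label_channels, hidden=128):
        super().__init__()
        self.norm = nn.BatchNorm2d(feat_channels, affine=False)
        self.shared = nn.Sequential(nn.Conv2d(label_channels, hidden, 3, padding=1), nn.ReLU())
        self.gamma = nn.Conv2d(hidden, feat_channels, 3, padding=1)
        self.beta = nn.Conv2d(hidden, feat_channels, 3, padding=1)

    def forward(self, x, layout):
        # resize the one-hot layout to the feature resolution and predict modulation maps
        layout = F.interpolate(layout, size=x.shape[2:], mode="nearest")
        h = self.shared(layout)
        return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)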

4 - End-to-end fine-tuning

Once again, one needs to update the paths in the scripts to point to the checkpoints from steps 1, 2, and 3.

Fine-tune layout and image models end-to-end:

bash scripts/train_layout_image_synthesizer_128.sh
bash scripts/train_partial_layout_image_synthesizer_128.sh

Testing

5 - Segmenter training

Model testing includes evaluating performance on a downstream task when using data synthesized by our approach. We train a segmentation model in different configurations: on real data, on synthetic data, or on a combination of both, with either full or partial editing of layouts.

bash scripts/train_segmenter_real_128.sh
bash scripts/train_segmenter_synthetic_128.sh
bash scripts/train_segmenter_real_and_synthetic_128.sh
bash scripts/train_segmenter_real_and_partial_synthetic_128.sh

6 - Performance evaluation

To compute all evaluation metrics and save them to a CSV file, run:

bash scripts/test_synthetic_128.sh
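
For context, the downstream metric of interest is segmentation quality. A minimal sketch of computing per-class IoU from a confusion matrix and writing the results to a CSV file (illustrative, not the repository's evaluation code):

import csv
import numpy as np

def iou_per_class(conf_matrix):
    # conf_matrix[i, j]: number of pixels of ground-truth class i predicted as class j
    tp = np.diag(conf_matrix)
    union = conf_matrix.sum(axis=0) + conf_matrix.sum(axis=1) - tp
    return tp / np.maximum(union, 1)

conf = np.array([[50, 2], [3, 45]])  # toy 2-class confusion matrix
ious = iou_per_class(conf)
with open("metrics.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["class_0_iou", "class_1_iou", "miou"])
    writer.writerow([ious[0], ious[1], ious.mean()])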

Demo

We provide checkpoints for various configurations:

Dataset        | Layout synthesis | Image synthesis | Resolution | End-to-end | Download
Cityscapes     | Partial          | GauGAN          | 128x256    |            | checkpoint
Cityscapes     | Full             | GauGAN          | 128x256    |            | checkpoint
Cityscapes-25k | Full             | GauGAN          | 128x256    |            | checkpoint
IDD            | Full             | GauGAN          | 128x256    |            | checkpoint
CelebA         | Partial          | SEAN            | 256x256    |            | checkpoint

Extract the checkpoints with the following command (replacing CKPT.zip with the corresponding file name) before running the demo.

unzip CKPT.zip -d checkpoints/

7 - Partial editing on Cityscapes

bash scripts/demo_partial_editing_128.sh

8 - Palette interpolation on Cityscapes-25k

bash scripts/demo_palette_interpolation_128.sh
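
Palette interpolation blends two class-proportion vectors; since valid palettes sum to 1, any convex combination is itself a valid palette that can be fed to the layout synthesizer. A small illustration (toy values, not the repository's code):

import numpy as np

def interpolate_palettes(p_a, p_b, steps=5):
    # linear interpolation between two class-proportion vectors
    return [(1 - t) * p_a + t * p_b for t in np.linspace(0.0, 1.0, steps)]

p_a = np.array([0.5, 0.3, 0.2])  # toy composition, e.g. mostly road
p_b = np.array([0.2, 0.3, 0.5])  # toy composition, e.g. mostly vegetation
for p in interpolate_palettes(p_a, p_b):
    print(p, p.sum())  # each intermediate palette still sums to 1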

9 - Latent interpolation on IDD

bash scripts/demo_latent_interpolation_128.sh

10 - Face editing on CelebA

To automatically edit semantic attributes of faces while keeping their appearance, we replace GauGAN with SEAN in our framework (by setting the flag --i_img_type to "style_generator"). We use a pretrained image model which can be downloaded from here.

By doing so, appearance can be propagated from a source image to the edited image. For new semantic content (which does not appear in the source image), we use a random style vector taken from the training dataset. Hence, we precompute style vectors by running:

bash scripts/extract_style.sh
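
Conceptually, SEAN uses one style code per semantic region: classes present in the source image keep the style extracted from it, while classes that only appear in the edited layout fall back to a randomly chosen precomputed style. A hedged sketch of that lookup (hypothetical names, not the repository's actual API):

import random
import torch

def build_style_codes(source_styles, style_bank, edited_classes):
    # source_styles: {class_id: style vector} extracted from the source image
    # style_bank: {class_id: list of style vectors precomputed from the training set}
    # edited_classes: class ids present in the edited layout
    codes = {}
    for c in edited_classes:
        if c in source_styles:
            codes[c] = source_styles[c]              # propagate appearance from the source
        else:
            codes[c] = random.choice(style_bank[c])  # new content: random training style
    return codes

# toy usage
bank = {c: [torch.randn(512)] for c in range(3)}
src = {0: torch.randn(512), 1: torch.randn(512)}
styles = build_style_codes(src, bank, edited_classes=[0, 1, 2])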

Then, after updating the path to the logs folder produced by the style extraction step, the following command performs face editing in different scenarios:

bash scripts/demo_face_editing_256.sh

Citation

If you find this code useful in your research, please consider citing:

@inproceedings{lemoing2021semanticpalette,
  title     = {Semantic Palette: Guiding Scene Generation with Class Proportions},
  author    = {Le Moing, Guillaume and Vu, Tuan-Hung and Jain, Himalaya and P{\'e}rez, Patrick and Cord, Matthieu},
  booktitle = {CVPR},
  year      = {2021}
}

Acknowledgments

This code borrows from GauGAN, SBGAN, SEAN and pytorch segmentation toolbox.

License

Semantic Palette is released under the Apache 2.0 license.