Aibek Alanov*,
Vadim Titov*,
Dmitry Vetrov
*Equal contribution
Abstract: The domain adaptation framework for GANs has made great progress in recent years as the main successful approach to training contemporary GANs with very limited training data. In this work, we significantly improve this framework by proposing an extremely compact parameter space for fine-tuning the generator. We introduce a novel domain-modulation technique that allows optimizing only a 6-thousand-dimensional vector instead of the 30 million weights of StyleGAN2 to adapt to a target domain. We apply this parameterization to state-of-the-art domain adaptation methods and show that it has almost the same expressiveness as the full parameter space. Additionally, we propose a new regularization loss that considerably enhances the diversity of the fine-tuned generator. Inspired by the reduction in the size of the optimized parameter space, we consider the problem of multi-domain adaptation of GANs, i.e. the setting where the same model can adapt to several domains depending on the input query. We propose the HyperDomainNet, a hypernetwork that predicts our parameterization given the target domain. We empirically confirm that it can successfully learn a number of domains at once and may even generalize to unseen domains.
This repository implements the domain-modulation technique and the HyperDomainNet from the paper "HyperDomainNet: Universal Domain Adaptation for Generative Adversarial Networks".
The domain-modulation technique significantly reduces the number of trainable parameters required for domain adaptation of StyleGAN2: from 30 million weights to a single 6-thousand-dimensional domain vector. The overall idea of this mechanism is illustrated in the following diagram:
The technique is implemented for two types of adaptation setups:
- text-driven single domain adaptation
- image2image domain adaptation.
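For intuition, below is a minimal sketch of the modulation mechanism: in StyleGAN2 the convolution weights are already modulated per input channel by a style vector, and domain modulation additionally rescales the same channels by a trainable domain vector. This is an illustrative sketch under that assumption, not the repository's exact code (see the `patch_key: cin_mult` option and the repo sources for the real implementation); the function name and shapes here are chosen for exposition.

```python
# Illustrative sketch of domain modulation (not the repo's exact code).
# The frozen StyleGAN2 weight is modulated per input channel by the usual style
# vector and, additionally, by a small trainable domain vector, so only ~6k
# numbers (one scalar per input channel across all conv layers) are trained.
import torch

def modulate_weight(weight, style, domain, eps=1e-8):
    # weight: [out_ch, in_ch, k, k]  frozen StyleGAN2 conv weight
    # style:  [batch, in_ch]         per-sample style modulation (from the W space)
    # domain: [in_ch]                trainable domain vector for this layer
    w = weight.unsqueeze(0)                        # [1, out_ch, in_ch, k, k]
    s = style.view(style.size(0), 1, -1, 1, 1)     # broadcast over out_ch and kernel
    d = domain.view(1, 1, -1, 1, 1)
    w = w * s * d                                  # channel-wise modulation
    demod = torch.rsqrt((w ** 2).sum(dim=[2, 3, 4]) + eps)  # StyleGAN2 demodulation
    return w * demod.view(w.size(0), w.size(1), 1, 1, 1)
```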
You can play with these setups in the Colab notebook we set up for you:
Inspired by the reduction in the size of the optimized parameter space, we propose the HyperDomainNet, a hypernetwork that predicts our parameterization given the target domain. The following diagram demonstrates its structure and how it is trained:
There are two setups for the HyperDomainNet:
- HyperDomainNet for any textual description
- HyperDomainNet for any given image (to be improved in future research).
You can also play with the HyperDomainNet in our Colab notebook.
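Conceptually, the hypernetwork maps a CLIP embedding of the target domain (a text prompt or an image) to the domain-modulation vector. The sketch below is only a schematic MLP version under assumed layer sizes; the actual architecture and the exact dimensionality of the domain vector are defined by the paper and the training configs.

```python
# Schematic sketch of a HyperDomainNet-style hypernetwork (sizes are assumptions).
import torch
import torch.nn as nn

class HyperDomainNetSketch(nn.Module):
    """Maps a CLIP embedding of the target domain to a domain-modulation vector."""

    def __init__(self, clip_dim=512, domain_dim=6000, hidden_dim=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(clip_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, domain_dim),
        )

    def forward(self, clip_embedding):
        # clip_embedding: [batch, clip_dim] CLIP embedding of a text prompt or image
        # returns: [batch, domain_dim] predicted domain vectors
        return self.net(clip_embedding)
```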
15/10/2022 Initial version
For all the methods described in the paper, it is required to have:
- Anaconda
- PyTorch >=1.7.1
- Packages from requirements.txt
Here, the code relies on Rosinality's PyTorch implementation of StyleGAN2. Some parts of the StyleGAN2 implementation were modified so that the whole implementation is native PyTorch.
In addition to the requirements mentioned above, a pretrained StyleGAN2 generator is downloaded by the `download.py` script.
All base requirements can be installed via:
```bash
conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install -r requirements.txt
```
Here, we provide the code for training.
In general, training can be launched with the following command:
```bash
python main.py exp.config={config_name}
```
A config has the following structure (comments describe the individual fields):

```yaml
config_dir: configs
config: config_name.yaml
project: WandbProjectName
tags:
  - tag1
  - tag2
name: WandbRunName
seed: 0
root: ./
notes: empty notes
step_save: 20        # model dump frequency
trainer: trainer_name

iter_num: 400        # number of training iterations
batch_size: 4
device: cuda:0
generator: stylegan2
patch_key: cin_mult
phase: mapping       # StyleGAN2 part which is fine-tuned; only used when patch_key = original
source_class: Photo  # description of the source domain
target_class: 3D Render in the Style of Pixar  # description of the target domain
auto_layer_k: 16
auto_layer_iters: 0  # number of iterations for adaptive freezing of the corresponding StyleGAN2 layers
auto_layer_batch: 8
mixing_noise: 0.9

visual_encoders:     # CLIP encoders that are used for the CLIP-based losses
  - ViT-B/32
  - ViT-B/16
loss_funcs:
  - loss_name1
  - loss_name2
loss_coefs:
  - loss_coef1
  - loss_coef2
g_reg_every: 4       # StyleGAN2 generator regularization interval (not recommended to change)
optimizer:
  weight_decay: 0.0
  lr: 0.01
  betas:
    - 0.9
    - 0.999

log_every: 10        # loss logging step
log_images: 20       # image logging step
truncation: 0.7      # truncation used when logging images
num_grid_outputs: 1  # number of logged image grids

is_on: false
start_from: false
step_backup: 100000
```
When training ends, model checkpoints can be found in `local_logged_exps/`. Each `ckpt_name.pt` can be used for inference via the helper class `Inferencer` in `core/utils/example_utils`.
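A rough usage sketch is shown below. Note that the constructor and call signatures of `Inferencer` here are assumptions made for illustration; the authoritative usage is in `examples/inference_playground.ipynb`.

```python
# Hypothetical usage sketch: the Inferencer signatures below are assumptions,
# see core/utils/example_utils and examples/inference_playground.ipynb for the real API.
import torch
from core.utils.example_utils import Inferencer

device = 'cuda:0'
ckpt_path = 'local_logged_exps/<exp_name>/ckpt_name.pt'  # hypothetical checkpoint location
inferencer = Inferencer(ckpt_path, device)               # assumed constructor signature

z = torch.randn(4, 512, device=device)                   # latent codes to sample from
images = inferencer([z], truncation=0.7)                 # assumed call; adapted-domain images
```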
Here, we provide the code for using pretrained checkpoints for inference.
Pretrained models for various stylizations are provided.
Please refer to `download.py` and run it with the flag `--load_type=checkpoints` (i.e. `python download.py --load_type=checkpoints`) from the repository root.
Downloaded checkpoints structure:

```
root/
  checkpoints/
    td_checkpoints/
      ...
    im2im_checkpoints/
      ...
    mapper_20_td.pt
    mapper_large_resample_td.pt
    mapper_base_im2im.pt
```
Each model except `mapper_base_im2im.pt` can be inferenced with `Inferencer`; to infer `mapper_base_im2im.pt`, use `Im2ImInferencer`.
Given a pretrained checkpoint for a certain target domain, one can edit a given image.
This operation can be done through the `examples/inference_playground.ipynb` notebook.
Core functions are:
- `mixing_noise` (latent code generation; sketched below)
- `Inferencer` (checkpoint processor)
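For reference, in Rosinality-style StyleGAN2 code (which this repo builds on) `mixing_noise` typically looks like the sketch below: it samples either a single latent code or, with probability `prob`, two codes for style mixing. The repository's own version may differ in details.

```python
# Sketch of mixing_noise as commonly found in Rosinality-style StyleGAN2 code;
# the repository's implementation may differ slightly.
import random
import torch

def mixing_noise(batch, latent_dim, prob, device):
    # With probability `prob`, return two latent codes for style mixing,
    # otherwise return a single code.
    if prob > 0 and random.random() < prob:
        return [torch.randn(batch, latent_dim, device=device) for _ in range(2)]
    return [torch.randn(batch, latent_dim, device=device)]
```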
The setup is the same as in the Inference section.
Given a pretrained checkpoint for a certain target domain, one can edit a given image.
The playground for editing can be found in the `examples/editing_playground.ipynb` notebook and in Google Colab:
Here, we provide the code for evaluation based on CLIP metrics.
Before evaluation, trained models must be obtained in one of the two ways mentioned above (trained or pretrained).
Given a checkpoint for a certain target domain, it can be evaluated through the `examples/evaluation.ipynb` notebook.
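As a rough illustration of a CLIP-based metric, the sketch below scores how well generated images match the target-domain text by their mean CLIP similarity. This is an assumed, simplified form of such a metric; the metrics actually reported in the paper are computed in `examples/evaluation.ipynb`.

```python
# Simplified sketch of a CLIP-based quality score (assumed form, for illustration only);
# the exact evaluation metrics live in examples/evaluation.ipynb.
import clip
import torch
import torch.nn.functional as F

device = 'cuda:0'
model, preprocess = clip.load('ViT-B/32', device=device)

@torch.no_grad()
def clip_text_similarity(pil_images, target_text):
    # pil_images: list of PIL images produced by the adapted generator
    image_batch = torch.stack([preprocess(img) for img in pil_images]).to(device)
    image_feat = F.normalize(model.encode_image(image_batch), dim=-1)
    tokens = clip.tokenize([target_text]).to(device)
    text_feat = F.normalize(model.encode_text(tokens), dim=-1)
    return (image_feat @ text_feat.T).mean().item()  # higher = closer to the target text
```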
The main idea is based on the one-shot (text-driven, image2image) methods StyleGAN-NADA and MindTheGap.
To edit real images, we invert them into StyleGAN's latent space using ReStyle.
If you use this code for your research, please cite our paper:
```bibtex
@article{alanov2022hyperdomainnet,
  title={HyperDomainNet: Universal domain adaptation for generative adversarial networks},
  author={Alanov, Aibek and Titov, Vadim and Vetrov, Dmitry P},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  pages={29414--29426},
  year={2022}
}
```