Domain Expansion of Image Generators - CVPR 2023

Official Implementation

This repo contains training and synthesis code for domain-expanded models as well as pre-trained weights.


Domain Expansion of Image Generators
Yotam Nitzan, Michaël Gharbi, Richard Zhang, Taesung Park, Jun-Yan Zhu, Daniel Cohen-Or, Eli Shechtman

Tel-Aviv University, Adobe Research, CMU

Abstract: Can one inject new concepts into an already trained generative model, while respecting its existing structure and knowledge? We propose a new task - domain expansion - to address this. Given a pretrained generator and novel (but related) domains, we expand the generator to jointly model all domains, old and new, harmoniously. First, we note the generator contains a meaningful, pretrained latent space. Is it possible to minimally perturb this hard-earned representation, while maximally representing the new domains? Interestingly, we find that the latent space offers unused, dormant directions, which do not affect the output. This provides an opportunity: By repurposing these directions, we can represent new domains without perturbing the original representation. In fact, we find that pretrained generators have the capacity to add several - even hundreds - of new domains! Using our expansion method, one expanded model can supersede numerous domain-specific models, without expanding the model size. Additionally, a single expanded generator natively supports smooth transitions between domains, as well as composition of domains.

Setup

Code was tested with Python 3.8.13, Pytorch 1.7.1 and CUDA 11.3 on Ubuntu 20.04.

This repository is built on top of stylegan2-ada-pytorch. You can follow their setup instructions and install our additional dependencies with:

pip install git+https://github.com/openai/CLIP.git
pip install wandb lpips

Alternatively, we provide an environment.yml file that can be used to create a Conda environment from scratch.

conda env create -f environment.yml

Inference

You can generate aligned images - i.e., the same latent code projected onto various subspaces - using generate_aligned.py.
MyStyle operates slightly differently: since the effect of its training is local, a latent code is often meaningless in other subspaces. To generate from a MyStyle-repurposed subspace, use generate_mystyle.py.
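To illustrate what "aligned images" means here, the sketch below uses a hypothetical helper (`project_to_subspace` is not this repo's API): the same base latent code is moved along a single repurposed direction per domain, so everything else stays shared across outputs.

```python
# Illustrative sketch only -- `project_to_subspace` is a hypothetical helper,
# not this repo's API. Aligned generation: one shared latent code, moved
# along a single repurposed direction per new domain.

LATENT_DIM = 512  # W-space dimensionality assumed for StyleGAN2

def project_to_subspace(w, dim, value):
    """Return a copy of latent `w` with coordinate `dim` set to `value`."""
    out = list(w)
    out[dim] = value
    return out

base = [0.0] * LATENT_DIM                       # shared latent code
domain_a = project_to_subspace(base, 510, 1.0)  # one repurposed direction
domain_b = project_to_subspace(base, 511, 1.0)  # another; the rest is aligned
```

Because only a single coordinate differs per domain, decoding these latents with the expanded generator yields the "same image" rendered in each domain.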

For convenience, we provide a couple of pretrained NADA-expanded models:

Parent Model              Number of new domains   Model
StyleGAN2 FFHQ            100                     Model
StyleGAN2-ADA AFHQ Dog    50                      Model

Training

The training interface is similar to that of stylegan2-ada-pytorch, with a few additional arguments. An example training command is given here.

The --expansion_cfg_file parameter points to a JSON configuration file specifying the domain expansions to perform. Two examples, applying NADA and MyStyle, are in the config_examples directory.

Here's NADA's example:

{
    "tasks": [
      {"type": "NADA", "dimension": 510, "args": {"source_text": "photo","target_text": "sketch"}},
      {"type": "NADA", "args": {"source_text": "person","target_text": "tolkein elf"}}
    ],
    "tasks_losses": {
        "NADA" : {
            "clip_models": ["ViT-B/32","ViT-B/16"],
            "clip_model_weights": [1.0, 1.0]
        }
      }
}

The first key, "tasks", defines the training tasks to perform on specific latent directions. If "dimension" is not specified for a task, we use the "most dormant" direction that is not already taken; in the example above, the elf task would repurpose dimension 511. Different tasks may use the same adaptation method, so shared arguments are specified separately under "tasks_losses".
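The dimension-assignment rule can be sketched as follows. This is an illustrative reimplementation, not the repo's code; it assumes (as in the example above) that dormancy increases with the index, so the highest free index is taken first.

```python
# Sketch (assumption): dimensions not set in the config receive the most
# dormant free direction, taken here to be the highest unused index in a
# 512-dimensional StyleGAN2 W space.

LATENT_DIM = 512

def assign_dimensions(tasks, latent_dim=LATENT_DIM):
    """Fill in missing "dimension" entries, most dormant direction first."""
    taken = {t["dimension"] for t in tasks if "dimension" in t}
    free = (d for d in range(latent_dim - 1, -1, -1) if d not in taken)
    return [
        {**t, "dimension": t["dimension"] if "dimension" in t else next(free)}
        for t in tasks
    ]

tasks = [
    {"type": "NADA", "dimension": 510},  # "sketch" task, dimension pinned
    {"type": "NADA"},                    # "elf" task, dimension left implicit
]
assigned = assign_dimensions(tasks)      # second task receives dimension 511
```

Under this assumption the implicit elf task lands on 511, matching the behavior described above.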

Since adaptation methods may require different numbers of training steps, we recommend expanding the domain gradually. For example, first repurpose several subspaces with MyStyle; once the results are satisfactory, repurpose more subspaces with NADA.

Support Additional Domain Adaptation Methods

To extend this repository and support additional domain adaptation tasks, you only need to define a new class inheriting from BaseTask. Please consider sending us a pull request if you do!
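A minimal extension might look like the sketch below. Note that `BaseTask` here is a stand-in so the example is self-contained, and the `compute_loss` hook and constructor signature are hypothetical; check the actual BaseTask interface in this repository before subclassing.

```python
# Hedged sketch of adding a new adaptation task. The class interface shown
# here (constructor arguments, `compute_loss`) is hypothetical -- consult
# the real BaseTask in this repo before writing your own subclass.

class BaseTask:  # stand-in for the repo's BaseTask, for self-containment
    def __init__(self, dimension, args):
        self.dimension = dimension  # latent direction this task repurposes
        self.args = args            # task-specific arguments from the config

class L2AnchorTask(BaseTask):
    """Toy task: penalize deviation from reference features."""

    def compute_loss(self, generated, reference):
        # Mean squared error over flat feature lists (hypothetical signature).
        return sum((g - r) ** 2 for g, r in zip(generated, reference)) / len(generated)

task = L2AnchorTask(dimension=509, args={})
loss = task.compute_loss([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

The key point is that each task owns its repurposed latent direction and contributes its own loss; the training loop can then treat all tasks uniformly.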

BibTeX

@inproceedings{nitzan2023domain,
  title={Domain Expansion of Image Generators},
  author={Nitzan, Yotam and Gharbi, Micha{\"e}l and Zhang, Richard and Park, Taesung and Zhu, Jun-Yan and Cohen-Or, Daniel and Shechtman, Eli},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}

Acknowledgments

Our code is built on top of stylegan2-ada-pytorch and borrows from StyleGAN-NADA and MyStyle.

Thanks to alvanlii for creating the HuggingFace Demo!