
Official implementation of "Composer: Creative and Controllable Image Synthesis with Composable Conditions"

MIT License

Composer

Official repo for Composer: Creative and Controllable Image Synthesis with Composable Conditions.

See Project Page for more examples.

Concept of Composer

Composer is a large (5-billion-parameter) controllable diffusion model trained on billions of (text, image) pairs. Because its conditions can be freely composed, the control space expands exponentially, giving an enormous number of ways to generate and manipulate images, i.e., making infinite use of finite means.
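To give a rough sense of why composition expands the control space exponentially, here is a minimal sketch that counts the ways a set of conditions can be combined. It assumes each condition is simply either supplied or omitted. The condition names are taken from the captions in this README; the `CONDITIONS` list and the `condition_subsets` helper are hypothetical and are not part of the Composer code base.

```python
from itertools import combinations

# Condition names collected from the captions in this README.
# Illustrative only: this is not an API of the Composer code base.
CONDITIONS = [
    "text",
    "depth",
    "masked image",
    "sketch",
    "embedding",
    "palette",
    "intensity",
    "segmentation map",
]


def condition_subsets(conditions):
    """Yield every non-empty combination of the given conditions."""
    for size in range(1, len(conditions) + 1):
        yield from combinations(conditions, size)


subsets = list(condition_subsets(CONDITIONS))

# With n conditions there are 2**n - 1 non-empty combinations.
print(len(subsets))
```

Each additional condition doubles the number of possible combinations, before even counting the different values each condition can take.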

TODO

  • Release training and inference code.
  • Release pretrained models.
  • Release Gradio UI.
  • A lightweight Latent-Composer built upon Stable Diffusion 2.1.

Composition Results

Composition of text and depth.

Text+Depth

Composition of masked image and text.

Masking+Text

Composition of sketch, depth and embedding (1).

Sketch,Depth+Embedding

Composition of sketch, depth and embedding (2).

Sketch,Depth+Embedding

Composition of text and palette.

Text+Palette

Composition of embedding and palette.

Embedding+Palette

Composition of intensity and palette.

Intensity+Palette

Manipulation Results

Image variations when fixing sketch, depth, palette and/or embedding.

Image Variations

Image interpolations when fixing sketch, depth, segmentation map and/or palette.

Image Interpolations

Image reconfigurations (manipulating an image by directly modifying its elements).

Image Reconfigurations

Color interpolations.

Color Interpolations

Region-specific image editing.

Region-specific image editing

Reformulation of Classical Tasks

Image translation.

Image Translation

Style transfer.

Style Transfer

Pose transfer.

Pose Transfer

Virtual try-on.

Virtual Try-on

BibTeX

@article{lhhuang2023composer,
  title={Composer: Creative and Controllable Image Synthesis with Composable Conditions},
  author={Huang, Lianghua and Chen, Di and Liu, Yu and Shen, Yujun and Zhao, Deli and Zhou, Jingren},
  journal={arXiv preprint arXiv:2302.09778},
  year={2023}
}