
Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image"


LinFusion


LinFusion: 1 GPU, 1 Minute, 16K Image
Songhua Liu, Weihao Yu, Zhenxiong Tan, and Xinchao Wang
Learning and Vision Lab, National University of Singapore

🔥News

[2024/09/08] We release the code for 16K image generation here!

[2024/09/05] Gradio demo for SD-v1.5 is released! Text-to-image, image-to-image, and IP-Adapter are currently supported.

Supported Models

  1. Yuanshi/LinFusion-1-5: For Stable Diffusion 1.5 and its variants.

Quick Start

  • If you have not already, install PyTorch and diffusers.

  • Clone this repo to your project directory:

    git clone https://github.com/Huage001/LinFusion.git
  • You only need two lines!

    from diffusers import AutoPipelineForText2Image
    import torch
    
    + from src.linfusion import LinFusion
    
    sd_repo = "Lykon/dreamshaper-8"
    
    pipeline = AutoPipelineForText2Image.from_pretrained(
        sd_repo, torch_dtype=torch.float16, variant="fp16"
    ).to(torch.device("cuda"))
    
    + linfusion = LinFusion.construct_for(pipeline)
    
    image = pipeline(
        "An astronaut floating in space. Beautiful view of the stars and the universe in the background.",
        generator=torch.manual_seed(123)
    ).images[0]

    LinFusion.construct_for(pipeline) returns a LinFusion model that matches the pipeline's structure, and this model automatically mounts itself onto the pipeline's forward pass.

  • examples/basic_usage.ipynb shows a basic text-to-image example.
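Under the hood, replacing the UNet's quadratic self-attention with a linear-complexity attention module is what makes very high resolutions tractable. The toy sketch below is not LinFusion's actual module (the `elu(x) + 1` feature map is an illustrative assumption, a common choice in linear-attention work); it only shows why linear attention avoids materializing the N×N attention matrix:

```python
import torch
import torch.nn.functional as F

def naive_linear_attention(q, k, v):
    """O(N^2) reference: explicitly builds the full n x n similarity matrix."""
    sim = (F.elu(q) + 1) @ (F.elu(k) + 1).T           # (n, n) -- infeasible at 16K
    return (sim @ v) / sim.sum(dim=-1, keepdim=True)

def linear_attention(q, k, v):
    """O(N) form: associate the matmuls the other way, never forming (n, n)."""
    q, k = F.elu(q) + 1, F.elu(k) + 1                 # positive feature map (assumed)
    kv = k.T @ v                                      # (d, e) summary, linear in n
    z = q @ k.sum(dim=0)                              # per-query normalizer, shape (n,)
    return (q @ kv) / z.unsqueeze(-1)

n, d, e = 64, 16, 16
q, k, v = (torch.randn(n, d, dtype=torch.float64) for _ in range(3))
out = linear_attention(q, k, v)
ref = naive_linear_attention(q, k, v)
```

Both routes compute the same result; only the memory and compute scaling differ, which is why the linear form can run at resolutions where softmax attention runs out of memory.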

Ultrahigh-Resolution Generation

  • From the perspective of efficiency, our method supports high-resolution generation such as 16K images. Nevertheless, directly applying diffusion models trained on low resolutions to higher-resolution generation can result in content distortion and duplication. To tackle this challenge, we apply techniques from SDEdit. The basic idea is to first generate a low-resolution result, based on which we gradually upscale the image. Please refer to examples/ultra_text2image_w_sdedit.ipynb for an example. Note that 16K generation currently requires an 80GB GPU. We will try to relax this constraint by implementing tiling strategies.
  • We are working on integrating LinFusion with more advanced approaches that are dedicated to high-resolution extension!
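The SDEdit-based procedure above boils down to: generate once at a base resolution, then repeatedly upscale the latest result and re-noise/denoise it with an image-to-image pass. A minimal sketch of the resolution schedule follows; the factor-of-2 progression and 1024 base are illustrative assumptions, so see the notebook for the actual settings:

```python
def upscale_schedule(base=1024, target=16384, factor=2):
    """Resolutions visited when progressively upscaling from base to target."""
    stages = [base]
    while stages[-1] < target:
        stages.append(min(stages[-1] * factor, target))
    return stages

# Each stage after the first would be an image-to-image (SDEdit) pass:
# upscale the previous result to the stage resolution, add noise at a
# moderate strength, and denoise it with the LinFusion-mounted pipeline.
```

With these assumed defaults the schedule visits 1024, 2048, 4096, 8192, and 16384, so only the final pass runs at full resolution.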

ToDo

  • Stable Diffusion 1.5 support.
  • Stable Diffusion 2.1 support.
  • Stable Diffusion XL support.
  • Release training code for LinFusion.
  • Release evaluation code for LinFusion.

Citation

If you find this repo helpful, please consider citing:

@article{liu2024linfusion,
  title         = {LinFusion: 1 GPU, 1 Minute, 16K Image},
  author        = {Liu, Songhua and Yu, Weihao and Tan, Zhenxiong and Wang, Xinchao},
  year          = {2024},
  eprint        = {2409.02097},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}