LinFusion: 1 GPU, 1 Minute, 16K Image
Songhua Liu, Weihao Yu, Zhenxiong Tan, and Xinchao Wang
Learning and Vision Lab, National University of Singapore
[2024/09/08] We release code for 16K image generation here!
[2024/09/05] Gradio demo for SD-v1.5 is released! Text-to-image, image-to-image, and IP-Adapter are currently supported.
- Clone this repo to your project directory:
git clone https://github.com/Huage001/LinFusion.git
- You only need two lines!
    from diffusers import AutoPipelineForText2Image
    import torch

    from src.linfusion import LinFusion  # added line 1 of 2

    sd_repo = "Lykon/dreamshaper-8"

    pipeline = AutoPipelineForText2Image.from_pretrained(
        sd_repo, torch_dtype=torch.float16, variant="fp16"
    ).to(torch.device("cuda"))

    linfusion = LinFusion.construct_for(pipeline)  # added line 2 of 2

    image = pipeline(
        "An astronaut floating in space. Beautiful view of the stars and the universe in the background.",
        generator=torch.manual_seed(123)
    ).images[0]
LinFusion.construct_for(pipeline) returns a LinFusion model that matches the pipeline's structure. This model is automatically mounted to the pipeline's forward function.
- examples/basic_usage.ipynb shows a basic text-to-image example.
- In terms of efficiency, our method supports high-resolution generation such as 16K images. Nevertheless, directly applying diffusion models trained on low resolutions to higher-resolution generation can result in content distortion and duplication. To tackle this challenge, we apply techniques from SDEdit. The basic idea is to first generate a low-resolution result and then gradually upscale the image based on it. Please refer to examples/ultra_text2image_w_sdedit.ipynb for an example. Note that 16K generation is currently available only on 80G GPUs. We will try to relax this constraint by implementing tiling strategies.
- We are working on integrating LinFusion with more advanced approaches that are dedicated to high-resolution extension!
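The coarse-to-fine idea described above can be sketched in a few lines. This is a minimal illustration and not the code in the notebook: the names `resolution_schedule`, `coarse_to_fine`, `generate`, `upscale`, and `refine` are hypothetical, and the doubling factor and the `strength` value are assumptions chosen only for the example.

```python
from typing import Callable, List, TypeVar

Image = TypeVar("Image")


def resolution_schedule(base: int, target: int, factor: int = 2) -> List[int]:
    """Side lengths from `base` up to `target`, multiplied by `factor` per stage.

    Hypothetical helper; the doubling default is an assumption.
    """
    if base <= 0 or target < base or factor < 2:
        raise ValueError("need 0 < base <= target and factor >= 2")
    sizes = [base]
    while sizes[-1] < target:
        # Clamp the last stage so the schedule ends exactly at `target`.
        sizes.append(min(sizes[-1] * factor, target))
    return sizes


def coarse_to_fine(
    generate: Callable[[int], Image],
    upscale: Callable[[Image, int], Image],
    refine: Callable[[Image, float], Image],
    sizes: List[int],
    strength: float = 0.5,
) -> Image:
    """Generate at the smallest size, then alternate upscaling and refinement.

    `refine` stands in for an SDEdit-style step: partially noise the upscaled
    image and denoise it again, with `strength` controlling how much noise.
    """
    image = generate(sizes[0])
    for size in sizes[1:]:
        image = upscale(image, size)      # plain resize to the next resolution
        image = refine(image, strength)   # re-denoise to restore detail
    return image


if __name__ == "__main__":
    print(resolution_schedule(512, 16384))
```

With diffusers, `generate` could wrap the text-to-image pipeline, `upscale` could wrap a plain image resize, and `refine` could wrap an image-to-image pipeline call with its `strength` argument. That wiring is an assumption here and may differ from what the notebook does.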
- Stable Diffusion 1.5 support.
- Stable Diffusion 2.1 support.
- Stable Diffusion XL support.
- Release training code for LinFusion.
- Release evaluation code for LinFusion.
If you find this repo helpful, please consider citing:
@article{liu2024linfusion,
title = {LinFusion: 1 GPU, 1 Minute, 16K Image},
author = {Liu, Songhua and Yu, Weihao and Tan, Zhenxiong and Wang, Xinchao},
year = {2024},
eprint = {2409.02097},
archivePrefix={arXiv},
primaryClass={cs.CV}
}