/diffuser_layerdiffuse

Create transparent image with Diffusers!

Primary LanguagePythonMIT LicenseMIT

Diffusers API of Transparent Image Layer Diffusion using Latent Transparency

🤗 Hugging Face: Hugging Face Spaces 🔥🔥🔥

Create transparent image with Diffusers!

This is a port to Diffuser from original SD Webui's Layer Diffusion to extend the ability to generate transparent image with your favorite API

Paper: Transparent Image Layer Diffusion using Latent Transparency

Setup

pip install -r requirements.txt

Quickstart

Generate transparent image with SD1.5 models. In this example, we will use digiplay/Juggernaut_final as the base model

Stable Diffusion 1.5

from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
import torch

from diffusers import StableDiffusionPipeline

from models import TransparentVAEDecoder
from loaders import load_lora_to_unet


model_path = hf_hub_download(
    'LayerDiffusion/layerdiffusion-v1',
    'layer_sd15_vae_transparent_decoder.safetensors',
)

vae_transparent_decoder = TransparentVAEDecoder.from_pretrained("digiplay/Juggernaut_final", subfolder="vae", torch_dtype=torch.float16).to("cuda")
vae_transparent_decoder.set_transparent_decoder(load_file(model_path))

pipeline = StableDiffusionPipeline.from_pretrained("digiplay/Juggernaut_final", vae=vae_transparent_decoder, torch_dtype=torch.float16, safety_checker=None).to("cuda")

model_path = hf_hub_download(
    'LayerDiffusion/layerdiffusion-v1',
    'layer_sd15_transparent_attn.safetensors'
)

load_lora_to_unet(pipeline.unet, model_path, frames=1)

image = pipeline(prompt="a dog sitting in room, high quality", 
                    width=512, height=512,
                    num_images_per_prompt=1, return_dict=False)[0]

Would produce the below image

demo_result

Stable Diffusion XL

It's a LoRA and will compatible with any Diffusers usage: ControlNet, IPAdapter, etc.

from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
import torch

from diffusers import StableDiffusionXLPipeline

from models import TransparentVAEDecoder


transparent_vae = TransparentVAEDecoder.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
model_path = hf_hub_download(
    'LayerDiffusion/layerdiffusion-v1',
    'vae_transparent_decoder.safetensors',
)
transparent_vae.set_transparent_decoder(load_file(model_path))

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", 
    vae=transparent_vae,
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")
pipeline.load_lora_weights('rootonchair/diffuser_layerdiffuse', weight_name='diffuser_layer_xl_transparent_attn.safetensors')

seed = torch.randint(high=1000000, size=(1,)).item()
prompt = "a cute corgi"
negative_prompt = ""
images = pipeline(prompt=prompt, 
                    negative_prompt=negative_prompt,
                    generator=torch.Generator(device='cuda').manual_seed(seed),
                    num_images_per_prompt=1, return_dict=False)[0]

images[0].save("result_sdxl.png")

Scripts

  • test_diffusers_fg_only.py: Only generate transparent foreground image
  • test_diffusers_joint.py: Generate foreground, background, blend image together. Hence num_images_per_prompt must be batch size of 3
  • test_diffusers_fg_only_sdxl.py: Only generate transparent foreground image using Attention injection in SDXL
  • test_diffusers_fg_only_conv_sdxl.py: Only generate transparent foreground image using Conv injection in SDXL

It is said by the author that Attention injection would result in better generation quality and Conv injection would result in better prompt alignment

Example

Stable Diffusion 1.5

Generate only transparent image with SD1.5

demo_dreamshaper

Generate foreground and background together

Foreground Background Blended
fg bg blend

Use with ControlNet

controlnet

Use with IP-Adapter

ip_adapter

Stable Diffusion XL

Combine with other LoRAs

Combine with SDXL Lora nerijs/pixel-art-xl

Attn Injection (LoRA) Conv Injection (Weight diff)
sdxl_attn sdxl_conv

Acknowledgments

This work is based on the great code at https://github.com/layerdiffusion/sd-forge-layerdiffuse