S2D2: Simple Stable Diffusion based on Diffusers

A simple Diffusers-based image generation module with upscaling features, for use in Jupyter Notebook, IPython, or the Python interactive shell.

Features

☑ ~~Just prepare safetensors files to go~~
☑ Run Hires.fix without AUTOMATIC1111/stable-diffusion-webui
- ☑ Latent Upscaler
- ☒ GAN models
☑ Multi-stage upscaling (extension of Hires.fix)
☑ LoRA
☑ ControlNet
☒ Multi-batch generation (only single-image generation is supported)
☑ Long-Prompt-Weighting usable without loading a custom pipeline

Schedule

Add Multi-ControlNet

Getting Started

1. Install libraries

pip install -r requirements.txt

2. Prepare the SD model files and, optionally, LoRA files

Place the files in the directory of your choice.

Ex.

3. Run jupyter notebook

cd s2d2
jupyter notebook

4. Import the main class and load options (LoRA, VAE, and ControlNet)

from s2d2 import StableDiffusionImageGenerator

# Setting Model (for pipeline.from_pretrained)
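model_path = "/content/drive/MyDrive/Model/fantasticmix"
# Path to the VAE (placeholder value; replace with your own VAE location)
vae_path = "/path/to/your/vae"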

# Initialize Generator
# (You can use "Long Prompt" without lpw_stable_diffusion)
generator = StableDiffusionImageGenerator(
  model_path,
  vae_path=vae_path,
  controlnet_path="lllyasviel/sd-controlnet-scribble",
  device="cuda",
  is_enable_xformers=False,
  # custom_pipeline="lpw_stable_diffusion",
)
# Load LoRA (multiple files can be loaded, each with its own alpha)
generator.load_lora(r"C:\xxx\lora_1.safetensors", alpha=0.2)
generator.load_lora(r"C:\xxx\lora_2.safetensors", alpha=0.15)
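
The `alpha` argument above controls how strongly each LoRA affects the base weights. As a minimal sketch of the general LoRA formulation (W' = W + alpha * up @ down), and not necessarily how `load_lora` is implemented in s2d2, the merge can be illustrated in plain Python. The helper names `matmul` and `merge_lora` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, up, down, alpha):
    """Return weight + alpha * (up @ down), the usual LoRA merge rule."""
    delta = matmul(up, down)
    return [
        [w + alpha * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight with a rank-1 LoRA update (illustrative values only).
weight = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # shape (2, 1)
down = [[3.0, 4.0]]    # shape (1, 2)

merged = merge_lora(weight, up, down, alpha=0.5)
print(merged)
```

A smaller `alpha` keeps the result closer to the base model, which is why several LoRA files are often stacked with low values such as 0.2 and 0.15.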

5. Generate an image using the enhance features (Hires.fix and its extended multi-stage upscaling)

from diffusers.utils import load_image
from controlnet_aux import HEDdetector

simage_name = '/content/drive/MyDrive/target.png'#@param {type:"string"}
hed = HEDdetector.from_pretrained('lllyasviel/Annotators')

simage = load_image(simage_name)
simage = hed(simage, scribble=True)

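# Example prompts (the ones used for the sample images in this README)
prompt = "1girl, solo, full body, blue eyes, looking at viewer, hairband, bangs, brown hair, long hair, smile, blue eyes, wine-red dress, outdoor, night, moonlight, castle, flowers, garden"
negative_prompt = "EasyNegative, extra fingers, fewer fingers, bad hands"

image = generator.diffusion_enhance(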
          prompt,
          negative_prompt,
          controlnet_image=simage,
          scheduler_name="dpm++_2m_karras", # [1]
          num_inference_steps=20, # [2]
          num_inference_steps_enhance=20, # [3]
          guidance_scale=10,  # [4]
          width=700, # [5]
          height=500, # [6]
          seed=-1, # [7]
          upscale_target="latent", # [8] "latent" or "pil". The "pil" mode is a temporary implementation.
          interpolate_mode="bicubic", # [9]
          antialias=True, # [10]
          upscale_by=1.8, # [11]
          enhance_steps=2, # [12] 2=Hires.fix
          denoising_strength=0.60, # [13]
          output_type="pil", # [14] "latent" or "pil"
          decode_factor=0.15, # [15] Denominator when decoding latents. Used to adjust the saturation of the image during decoding.
          decode_factor_final=0.18215, # [16] Denominator when decoding final latents.
          )
image.save("generated_image.jpg") # or just "image" to display the image in Jupyter
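
The comments on `decode_factor` and `decode_factor_final` describe them as denominators applied when decoding latents. A minimal sketch of that idea follows, assuming the latents are simply divided by the factor before being passed to the VAE decoder (an assumption based on the "denominator" wording, not a confirmed detail of s2d2). The function name `unscale_latents` and the latent values are hypothetical; 0.18215 is the VAE scaling factor commonly used by Stable Diffusion v1 models.

```python
SD_VAE_SCALING = 0.18215  # same value as decode_factor_final in the call above


def unscale_latents(latents, decode_factor):
    """Divide each latent value by decode_factor (assumed behaviour)."""
    return [value / decode_factor for value in latents]


latents = [0.1, -0.2, 0.3]  # illustrative latent values

standard = unscale_latents(latents, SD_VAE_SCALING)
boosted = unscale_latents(latents, 0.15)

# A smaller denominator enlarges every latent value by the same ratio.
gain = SD_VAE_SCALING / 0.15
print(round(gain, 4))
```

Under this assumption, `decode_factor=0.15` feeds the decoder latents about 21% larger in magnitude than the standard factor would, which is one plausible reason the intermediate stages look more saturated.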

Correspondence between web UI settings and parameters

Parameters

🚧🚧🚧🚧🚧Under construction🚧🚧🚧🚧🚧

Available schedulers are:

import diffusers

SCHEDULERS = {
    "unipc": diffusers.schedulers.UniPCMultistepScheduler,
    "euler_a": diffusers.schedulers.EulerAncestralDiscreteScheduler,
    "euler": diffusers.schedulers.EulerDiscreteScheduler,
    "ddim": diffusers.schedulers.DDIMScheduler,
    "ddpm": diffusers.schedulers.DDPMScheduler,
    "deis": diffusers.schedulers.DEISMultistepScheduler,
    "dpm2": diffusers.schedulers.KDPM2DiscreteScheduler,
    "dpm2-a": diffusers.schedulers.KDPM2AncestralDiscreteScheduler,
    "dpm++_2s": diffusers.schedulers.DPMSolverSinglestepScheduler,
    "dpm++_2m": diffusers.schedulers.DPMSolverMultistepScheduler,
    "dpm++_2m_karras": diffusers.schedulers.DPMSolverMultistepScheduler,
    "dpm++_sde": diffusers.schedulers.DPMSolverSDEScheduler,
    "dpm++_sde_karras": diffusers.schedulers.DPMSolverSDEScheduler,
    "heun": diffusers.schedulers.HeunDiscreteScheduler,
    "heun_karras": diffusers.schedulers.HeunDiscreteScheduler,
    "lms": diffusers.schedulers.LMSDiscreteScheduler,
    "lms_karras": diffusers.schedulers.LMSDiscreteScheduler,
    "pndm": diffusers.schedulers.PNDMScheduler,
}
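
Several keys in the table above (for example `dpm++_2m` and `dpm++_2m_karras`) map to the same scheduler class, so the Karras variant has to be selected by configuration rather than by class. The sketch below shows one way such a name could be resolved, assuming the `_karras` suffix is translated into the `use_karras_sigmas=True` config option that these diffusers schedulers accept. The function `resolve_scheduler` and the string-valued stand-in table are hypothetical and are used so that the example runs without diffusers installed.

```python
# Stand-in table: class names as strings instead of diffusers classes.
SCHEDULER_NAMES = {
    "dpm++_2m": "DPMSolverMultistepScheduler",
    "dpm++_2m_karras": "DPMSolverMultistepScheduler",
    "heun": "HeunDiscreteScheduler",
    "heun_karras": "HeunDiscreteScheduler",
}


def resolve_scheduler(name):
    """Return (class name, config overrides) for a scheduler key."""
    if name not in SCHEDULER_NAMES:
        raise ValueError(f"Unknown scheduler: {name!r}")
    # Assumption: a "_karras" suffix means "use Karras sigmas".
    overrides = {"use_karras_sigmas": True} if name.endswith("_karras") else {}
    return SCHEDULER_NAMES[name], overrides


print(resolve_scheduler("dpm++_2m"))
print(resolve_scheduler("dpm++_2m_karras"))
```

With diffusers itself, overrides like these are typically applied through `SchedulerClass.from_config(pipe.scheduler.config, **overrides)`.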

Generated sample images

Model used: Counterfeit-V30.safetensors
Initial resolution: 696x496
Upscale factor: 1.8
Target resolution: 696x496 x 1.8, rounded down to a multiple of 8 = 1248x888
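
The target resolution can be checked with a few lines of arithmetic: 696 x 1.8 = 1252.8 and 496 x 1.8 = 892.8, and rounding each side down to a multiple of 8 gives 1248x888. The helper `target_size` below is a hypothetical illustration of that calculation, not a function from s2d2.

```python
def target_size(width, height, upscale_by, multiple=8):
    """Scale both sides and round each down to a multiple of `multiple`."""

    def snap(value):
        return int(value * upscale_by) // multiple * multiple

    return snap(width), snap(height)


print(target_size(696, 496, 1.8))  # (1248, 888)
```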

2-stage upscaling (Hires.fix)

N-stage upscaling (e.g. N=4)

Stepwise upscaling between the initial resolution and the target resolution.
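
As a rough sketch of what stepwise upscaling could look like, the following computes one resolution per stage, assuming the stages are spaced linearly between the initial and target resolutions and each side is rounded down to a multiple of 8. The linear spacing and the function name `stage_sizes` are assumptions made for illustration; the actual schedule used by s2d2 may differ.

```python
def stage_sizes(initial, target, stages, multiple=8):
    """List (width, height) per stage, linearly spaced from initial to target."""
    if stages < 2:
        raise ValueError("stages must be at least 2")
    sizes = []
    for i in range(stages):
        size = tuple(
            (start + (end - start) * i // (stages - 1)) // multiple * multiple
            for start, end in zip(initial, target)
        )
        sizes.append(size)
    return sizes


print(stage_sizes((696, 496), (1248, 888), stages=2))
print(stage_sizes((696, 496), (1248, 888), stages=4))
```

With `stages=2` this reduces to the Hires.fix case: one pass at the initial resolution and one at the target resolution.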

Comparison of images generated with and without latent upscaling

Without latent upscaling: single generation at 696x496
With latent upscaling: 2-stage generation (like Hires.fix, 696x496 to 1248x888)
Prompt: "1girl, solo, full body, blue eyes, looking at viewer, hairband, bangs, brown hair, long hair, smile, blue eyes, wine-red dress, outdoor, night, moonlight, castle, flowers, garden"
Negative prompt: "EasyNegative, extra fingers, fewer fingers, bad hands"

ootsuka-biz/S2D2-from_pretrained-custom