
Stable Diffusion Improvements

Goals

The focus of this fork of Stable Diffusion is to improve:

  1. Workflow
    • Sensible GUI with shortcuts and launch arguments at your fingertips
    • Quicker iteration times (the model stays in memory, among other improvements)
  2. Determinism
    • Deterministic per-sample seeding, with individually indexable subsamples
    • Sequential seed stepping per iteration (see the sketch after this list)
  3. Usability
    • Accessibility through UI
    • More concise and fool-proof installation
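
As a rough illustration of the seeding scheme above, here is a minimal Python sketch. It is an assumption for illustration only, not this fork's actual implementation: the base_seed + index arithmetic is made up, and torch.randn stands in for the real diffusion sampler.

import torch

def generate_grid(base_seed: int, n_iter: int, n_samples: int):
    """Each (iteration, sample) pair gets its own deterministic seed,
    so any single subsample is indexable and reproducible on its own."""
    images = []
    for i in range(n_iter):
        for j in range(n_samples):
            # Sequential seed stepping: sample j of iteration i always
            # sees seed base_seed + i * n_samples + j, independent of
            # which other samples have already run.
            torch.manual_seed(base_seed + i * n_samples + j)
            images.append(torch.randn(4, 64, 64))  # stand-in latent
    return images

Because no seed depends on earlier samples having run, regenerating only subsample (i=1, j=0) reproduces exactly the same output; the loaded model (omitted here) can likewise stay in memory across calls, which is what makes iteration quick.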

(Demo GIF: interactive usage)

Disclaimer

Some of this should be common sense, but by downloading these files you hereby agree to the following:

  • I'm not responsible in any way for what you choose to generate with this, nor do I claim ownership of any such output.
    • Beware, certain model weights you download from various sources may be capable of producing disturbing imagery.
    • You must read and understand the relevant license of any model you use.
  • This is a hacked-together work-in-progress.
    • Expect things to break. Frequently.
    • Be kind and collaborative in discussions online about Stable Diffusion, and other similar tools.

Installation

Requirements

  • Modern NVIDIA GPU with > 10GB VRAM

Windows

Install Miniconda3, a minimal installer for the conda Python environment manager.

From the Start Menu, run Anaconda Prompt (miniconda3). Execute all further shell commands within this prompt.

From the base of this repository, create the environment (this automatically fetches all dependencies):

conda env create -f environment.yaml
conda activate ldm

Place your obtained model.ckpt file in a new folder: .\stable-diffusion-improvements\models\ldm\stable-diffusion-v1

If it's not named model.ckpt and is instead something like sd-v1-4.ckpt, just rename it, as shown below.
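
For example, from the base of the repository in the Anaconda Prompt (the sd-v1-4.ckpt filename below is only an example; substitute whatever your downloaded checkpoint is actually called):

mkdir models\ldm\stable-diffusion-v1
move sd-v1-4.ckpt models\ldm\stable-diffusion-v1\model.ckpt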

Usage

Using the interactive mode is straightforward. Simply call (with the ldm conda env active):

python scripts\txt2img.py --interactive

Any classic arguments passed to txt2img.py will show up in the interactive view as parameter textbox/checkbox defaults. Once the Generate button becomes active, create to your heart's content.
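
For example, to pre-fill the interactive defaults (the flags shown are the standard upstream txt2img.py arguments; treat them as assumptions here, since this fork may rename or drop some of them):

python scripts\txt2img.py --interactive --prompt "a photograph of an astronaut riding a horse" --seed 42 --n_samples 2 --ddim_steps 50

The prompt, seed, sample count, and step count then appear as the initial values of the matching textboxes in the interactive view.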

More info and features coming soon, so keep your repository up to date.

Acknowledgements

From Stability.ai:

Stable Diffusion was made possible thanks to a collaboration with Stability AI and Runway and builds upon our previous work:

High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach*, Andreas Blattmann*, Dominik Lorenz, Patrick Esser, Björn Ommer

CVPR '22 Oral

which is available on GitHub. PDF at arXiv. Please also visit our Project page.

Stable Diffusion is a latent text-to-image diffusion model. Thanks to a generous compute donation from Stability AI and support from LAION, we were able to train a Latent Diffusion Model on 512x512 images from a subset of the LAION-5B database. Similar to Google's Imagen, this model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM. See this section below and the model card.