An SDK/Python library for Automatic 1111 to run state-of-the-art diffusion models

Primary language: Python. License: GNU Affero General Public License v3.0 (AGPL-3.0)

Auto 1111 SDK: Stable Diffusion Python library

Auto 1111 SDK is a lightweight Python library for generating, upscaling, and editing images with Stable Diffusion and other diffusion models. It is designed as a modular Python client that encapsulates all the main features of the [Automatic 1111 Stable Diffusion Web Ui](https://github.com/AUTOMATIC1111/stable-diffusion-webui). Auto 1111 SDK currently offers three core features:
  • Text-to-Image, Image-to-Image, Inpainting, and Outpainting pipelines. Our pipelines support the exact same parameters as the Stable Diffusion Web UI, so you can easily replicate creations from the Web UI on the SDK.
  • Upscaling pipelines that can run inference for any ESRGAN or Real-ESRGAN upscaler in a few lines of code.
  • An integration with Civit AI to directly download models from the website.

Join our Discord!!

Demo

We have a Colab demo where you can run many of the operations of Auto 1111 SDK. Check it out here!!

Installation

We recommend installing Auto 1111 SDK from PyPI in a virtual environment. Conda environments are not supported yet.

pip3 install auto1111sdk

To install the latest version of Auto 1111 SDK (with controlnet now included), run:

pip3 install git+https://github.com/saketh12/Auto1111SDK.git
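
After installing, it can be useful to confirm that the package is visible from the interpreter you plan to use. The snippet below is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and it assumes the distribution is registered under the same name used in the `pip3 install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="auto1111sdk"):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("auto1111sdk is not installed in this environment")
    else:
        print(f"auto1111sdk {found} is installed")
```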

Quickstart

Generating images with Auto 1111 SDK is super easy. A single pipeline handles Text-to-Image, Image-to-Image, Inpainting, Outpainting, and Stable Diffusion Upscale. This saves a lot of RAM compared with other solutions, which need a separate pipeline object for each operation.

from auto1111sdk import StableDiffusionPipeline

pipe = StableDiffusionPipeline("<Path to your local safetensors or checkpoint file>")

prompt = "a picture of a brown dog"
output = pipe.generate_txt2img(prompt = prompt, height = 1024, width = 768, steps = 10)

output[0].save("image.png")
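
Because one pipeline object serves every request, it can be created once and reused for several prompts. The sketch below wraps the `generate_txt2img` call from the Quickstart in a loop; `generate_batch`, the `outputs` directory, and the file naming scheme are hypothetical choices for illustration, not part of the SDK.

```python
from pathlib import Path


def generate_batch(pipe, prompts, out_dir="outputs", height=1024, width=768, steps=10):
    """Run txt2img for each prompt on one shared pipeline and save the first image of each."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, prompt in enumerate(prompts):
        # Same call as in the Quickstart; the result is indexed like a list of images.
        output = pipe.generate_txt2img(prompt=prompt, height=height, width=width, steps=steps)
        target = out_path / f"image_{index:03d}.png"
        output[0].save(str(target))
        saved.append(target)
    return saved


# Usage, assuming a pipeline created as in the Quickstart:
# pipe = StableDiffusionPipeline("<Path to your local safetensors or checkpoint file>")
# generate_batch(pipe, ["a picture of a brown dog", "a picture of a grey cat"])
```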

Controlnet

Right now, Controlnet only works with fp32. We are adding support for fp16 very soon.

from auto1111sdk import StableDiffusionPipeline
from auto1111sdk import ControlNetModel

model = ControlNetModel(model="<THE CONTROLNET MODEL FILE NAME (WITHOUT EXTENSION)>",
                        image="<PATH TO IMAGE>")

pipe = StableDiffusionPipeline("<Path to your local safetensors or checkpoint file>", controlnet=model)

prompt = "a picture of a brown dog"
output = pipe.generate_txt2img(prompt = prompt, height = 1024, width = 768, steps = 10)

output[0].save("image.png")

Running on Windows

Find the instructions here. Contributed by Marco Guardigli, mgua@tomware.it

Documentation

We have more detailed examples/documentation of how you can use Auto 1111 SDK here. For a detailed comparison between us and Huggingface diffusers, you can read this.

For a detailed guide on how to use SDXL, we recommend reading this.

Features

  • Original txt2img and img2img modes
  • Real-ESRGAN and ESRGAN upscaling (compatible with any .pth file)
  • Outpainting
  • Inpainting
  • Stable Diffusion Upscale
  • Attention: specify parts of the text that the model should pay more attention to
    • a man in a ((tuxedo)) - will pay more attention to tuxedo
    • a man in a (tuxedo:1.21) - alternative syntax
    • select text and press Ctrl+Up or Ctrl+Down (or Command+Up or Command+Down on macOS) to automatically adjust attention to the selected text (code contributed by an anonymous user)
  • Composable Diffusion: a way to use multiple prompts at once
    • separate prompts using uppercase AND
    • also supports weights for prompts: a cat :1.2 AND a dog AND a penguin :2.2
  • Works with a variety of samplers
  • Download models from Civit AI and Real-ESRGAN checkpoints directly
  • Set custom VAE: works for any model including SDXL
  • Support for SDXL with Stable Diffusion XL Pipelines
  • Pass in custom arguments to the models
  • No 77-token prompt limit (unlike Huggingface Diffusers, which has this limit)
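
The attention and Composable Diffusion syntaxes listed above can be illustrated with a small standalone parser. This is a simplified sketch, not the SDK's own parser: `emphasis_weights` and `split_composable` are hypothetical helpers, and two behaviours are assumptions taken from the Stable Diffusion Web UI conventions, namely that each pair of round brackets multiplies attention by 1.1 (so `((tuxedo))` matches `(tuxedo:1.21)`) and that a prompt without an explicit weight defaults to 1.0. Escaped brackets, square brackets, and mixed nesting are not handled.

```python
import re

# Assumption: each pair of round brackets multiplies attention by 1.1,
# so ((x)) gives 1.1 * 1.1 = 1.21.
PAREN_FACTOR = 1.1

_EXPLICIT = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")
_NESTED = re.compile(r"(\(+)([^()]+)(\)+)")
_AND = re.compile(r"\bAND\b")
_WEIGHT = re.compile(r"^(.*?)\s*:\s*([0-9]*\.?[0-9]+)\s*$", re.DOTALL)


def emphasis_weights(prompt):
    """Return (text, weight) pairs for emphasised spans; explicit weights are listed first."""
    found = []
    for match in _EXPLICIT.finditer(prompt):
        found.append((match.group(1).strip(), float(match.group(2))))
    # Drop the explicit spans so they are not counted again as plain brackets.
    remainder = _EXPLICIT.sub("", prompt)
    for match in _NESTED.finditer(remainder):
        depth = min(len(match.group(1)), len(match.group(3)))
        found.append((match.group(2).strip(), round(PAREN_FACTOR ** depth, 4)))
    return found


def split_composable(prompt):
    """Split a prompt on uppercase AND and read an optional trailing ':weight'."""
    parts = []
    for chunk in _AND.split(prompt):
        chunk = chunk.strip()
        if not chunk:
            continue
        match = _WEIGHT.match(chunk)
        if match:
            parts.append((match.group(1).strip(), float(match.group(2))))
        else:
            # Assumed default weight when none is given.
            parts.append((chunk, 1.0))
    return parts


print(emphasis_weights("a man in a ((tuxedo))"))
print(split_composable("a cat :1.2 AND a dog AND a penguin :2.2"))
```

Only the uppercase word `AND` acts as a separator here, so a prompt such as "salt and pepper" stays a single prompt.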

Roadmap

  • Adding support for Hires Fix and Refiner parameters for inference.
  • Adding support for LoRAs.
  • Adding support for face restoration.
  • Adding support for a Dreambooth training script.
  • Adding support for custom extensions like Controlnet.

We will be adding support for these features very soon. We also accept any contributions to work on these issues!

Contributing

Auto 1111 SDK is continuously evolving, and we appreciate community involvement. We welcome all forms of contributions: bug reports, feature requests, and code contributions.

Report bugs and request features by opening an issue on GitHub. Contribute to the project by forking/cloning the repository and submitting a pull request with your changes.

Credits

Licenses for borrowed code can be found in the Settings -> Licenses screen, and also in the html/licenses.html file.