Inference Stable Diffusion with C# and ONNX Runtime

This repo contains the logic to run inference for the popular Stable Diffusion deep learning model in C#. Stable Diffusion models take a text prompt and generate an image that represents the text. See the example below:

For the example sentence below, the CLIP model creates a text embedding that connects text to image. A random noise image is created and then denoised with the unet model and the scheduler algorithm, guided by the text embedding, to produce a latent image that represents the text prompt. Lastly, the decoder model vae_decoder converts that latent image into the final image.

"make a picture of green tree with flowers around it and a red sky"

Auto Generated Random Latent Seed Input	Resulting image output


Prerequisites

Download the Source Code from GitHub
Visual Studio or VS Code
A GPU-enabled machine with the CUDA execution provider (EP) configured. This was built on an RTX 3070 and has not been tested on anything smaller. Follow this tutorial to configure CUDA and cuDNN for GPU with ONNX Runtime and C# on Windows 11.

Use Hugging Face to download the Stable Diffusion models

Download the ONNX Stable Diffusion models from Hugging Face.

Once you have selected a model version repo, click Files and Versions, then select the ONNX branch. If there isn't an ONNX model branch available, use the main branch and convert it to ONNX. See the ONNX conversion tutorial for PyTorch for more information.

Clone the repo:

git lfs install
git clone https://huggingface.co/CompVis/stable-diffusion-v1-4 -b onnx

Copy the folders with the ONNX files to the C# project folder \StableDiffusion\StableDiffusion. The folders to copy are: unet, vae_decoder, text_encoder, safety_checker.
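
The copy step can be scripted. The sketch below is one possible way to do it from a Bash-compatible shell such as Git Bash; the function name copy_onnx_models is hypothetical, and both paths passed to it are assumptions that depend on where you cloned each repo. It checks that all four folders exist before copying anything, so a partial clone fails early.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): copy the four ONNX model folders
# from a cloned Hugging Face model repo into the C# project folder.
#   $1 = path to the cloned model repo (assumed, e.g. ./stable-diffusion-v1-4)
#   $2 = path to the C# project folder (assumed, e.g. ./StableDiffusion/StableDiffusion)
copy_onnx_models() {
  local src="$1" dest="$2" name

  # Verify every expected folder is present before copying anything.
  for name in unet vae_decoder text_encoder safety_checker; do
    if [ ! -d "$src/$name" ]; then
      echo "missing folder: $src/$name" >&2
      return 1
    fi
  done

  # Create the destination if needed, then copy each folder recursively.
  mkdir -p "$dest"
  for name in unet vae_decoder text_encoder safety_checker; do
    cp -R "$src/$name" "$dest/"
  done
}
```

Assuming the clone sits next to the source code, a call would look like: copy_onnx_models ./stable-diffusion-v1-4 ./StableDiffusion/StableDiffusion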

Follow the full Stable Diffusion C# Tutorial for this Repo here
