
DiffSynth Studio



Introduction

DiffSynth Studio is a Diffusion engine. We have restructured the architectures of the Text Encoder, UNet, VAE, and other components, maintaining compatibility with models from the open-source community while enhancing computational performance. We provide many interesting features. Enjoy the magic of Diffusion models!

To date, DiffSynth Studio supports the following models:

News

  • August 22, 2024 We have implemented an interesting painter that supports all text-to-image models. Now you can create stunning images using the painter, with assistance from AI!

  • August 21, 2024 FLUX is supported in DiffSynth-Studio.

    • Enable CFG and highres-fix to improve visual quality. See here
    • LoRA, ControlNet, and additional models will be available soon.
  • June 21, 2024. 🔥🔥🔥 We propose ExVideo, a post-tuning technique aimed at enhancing the capability of video generation models. We have extended Stable Video Diffusion to generate long videos of up to 128 frames.

  • June 13, 2024. DiffSynth Studio has been transferred to ModelScope. The developers have transitioned from "I" to "we". Of course, I will still participate in development and maintenance.

  • Jan 29, 2024. We propose Diffutoon, a fantastic solution for toon shading.

    • Project Page
    • The source code is released in this project.
    • The technical report (IJCAI 2024) is released on arXiv.
  • Dec 8, 2023. We decided to develop a new project aiming to unleash the potential of diffusion models, especially in video synthesis. Development of this project has started.

  • Nov 15, 2023. We propose FastBlend, a powerful video deflickering algorithm.

  • Oct 1, 2023. We release an early version of this project, namely FastSDXL, an attempt at building a diffusion engine.

    • The source code is released on GitHub.
    • FastSDXL includes a trainable OLSS scheduler for efficiency improvement.
      • The original repo of OLSS is here.
      • The technical report (CIKM 2023) is released on arXiv.
      • A demo video is shown on Bilibili.
      • Since OLSS requires additional training, we don't implement it in this project.
  • Aug 29, 2023. We propose DiffSynth, a video synthesis framework.

Installation

Install from source code (recommended):

git clone https://github.com/modelscope/DiffSynth-Studio.git
cd DiffSynth-Studio
pip install -e .

Or install from PyPI:

pip install diffsynth

Usage (in Python code)

The Python examples are in examples. We provide an overview here.

Download Models

Download the pre-set models. Model IDs can be found in the config file.

from diffsynth import download_models

download_models(["FLUX.1-dev", "Kolors"])

Download your own models.

from diffsynth.models.downloader import download_from_huggingface, download_from_modelscope

# From ModelScope (recommended)
download_from_modelscope("Kwai-Kolors/Kolors", "vae/diffusion_pytorch_model.fp16.bin", "models/kolors/Kolors/vae")
# From Hugging Face
download_from_huggingface("Kwai-Kolors/Kolors", "vae/diffusion_pytorch_model.fp16.safetensors", "models/kolors/Kolors/vae")
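Both downloaders take the same (model_id, file_path, save_dir) arguments, so a mirror-fallback wrapper is straightforward to sketch. The helper below is illustrative and not part of the library; the `fetch_model_file` name, the error handling, and the assumed `save_dir/basename(file_path)` output layout are all assumptions.

```python
import os

def target_path(save_dir, file_path):
    # Assumed layout: the file lands in save_dir under its base name,
    # e.g. "models/kolors/Kolors/vae/diffusion_pytorch_model.fp16.bin".
    return os.path.join(save_dir, os.path.basename(file_path))

def fetch_model_file(model_id, file_path, save_dir, prefer_modelscope=True):
    """Hypothetical wrapper: try one mirror, fall back to the other."""
    from diffsynth.models.downloader import (
        download_from_huggingface,
        download_from_modelscope,
    )
    sources = [download_from_modelscope, download_from_huggingface]
    if not prefer_modelscope:
        sources.reverse()
    for download in sources:
        try:
            download(model_id, file_path, save_dir)
            return target_path(save_dir, file_path)
        except Exception:
            continue  # network error or missing file: try the next mirror
    raise RuntimeError(f"could not fetch {file_path} from any source")
```

Note that the ModelScope and Hugging Face file names differ in the example above (`.bin` vs `.safetensors`), so pass the file path that matches the mirror you prefer.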

Video Synthesis

Long Video Synthesis

We trained an extended video synthesis model that can generate up to 128 frames. examples/ExVideo

(Demo video: github_title.mp4)

Toon Shading

Render realistic videos in a flat, toon-shaded style, with video editing features enabled. examples/Diffutoon

(Demo videos: Diffutoon.mp4, Diffutoon_edit.mp4)

Video Stylization

Video stylization without video models. examples/diffsynth

(Demo video: winter_stone.mp4)

Image Synthesis

Generate high-resolution images by breaking the resolution limitations of diffusion models! examples/image_synthesis.
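One way to push past a model's native resolution is the highres-fix mentioned in the news above: generate at the base resolution, upscale, then re-denoise the upscaled image at partial strength so the model adds detail without changing the composition. A framework-agnostic sketch, in which `generate` and `upscale` are hypothetical stand-ins for your pipeline's own calls:

```python
def highres_fix(generate, upscale, prompt, base=1024, scale=2, strength=0.5):
    """Two-pass highres-fix sketch (not DiffSynth's actual implementation).

    Assumed signatures:
      generate(prompt, height, width, image=None, strength=1.0) -> image
      upscale(image, factor) -> image
    """
    # Pass 1: text-to-image at the model's native resolution.
    image = generate(prompt, height=base, width=base)
    # Upscale the intermediate result (e.g. a resize or an upscaler model).
    image = upscale(image, scale)
    # Pass 2: image-to-image at high resolution with partial denoising
    # strength, so this pass refines detail rather than regenerating the scene.
    return generate(prompt, height=base * scale, width=base * scale,
                    image=image, strength=strength)
```

The low second-pass strength is the key design choice: a full-strength second pass would ignore the upscaled image and produce the repeated-subject artifacts that plague naive high-resolution sampling.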

LoRA fine-tuning is supported in examples/train.

(Sample 1024×1024 images generated with FLUX, Stable Diffusion 3, Kolors, Hunyuan-DiT, Stable Diffusion, and Stable Diffusion XL.)

Usage (in WebUI)

Create stunning images using the painter, with assistance from AI!

video.mp4

This video is not rendered in real time.

Before launching the WebUI, please download models to the folder ./models. See here.

  • Gradio version
pip install gradio
python apps/gradio/DiffSynth_Studio.py


  • Streamlit version
pip install streamlit streamlit-drawable-canvas
python -m streamlit run apps/streamlit/DiffSynth_Studio.py
(Demo video: sdxl_turbo_ui.mp4)