DiffSynth Studio is a diffusion engine. We have restructured the architectures of the Text Encoder, UNet, VAE, and other components, maintaining compatibility with models from the open-source community while enhancing computational performance. We provide many interesting features. Enjoy the magic of diffusion models!
- Aug 29, 2023. I propose DiffSynth, a video synthesis framework.
  - Project Page.
  - The source code is released in EasyNLP.
  - The technical report (ECML PKDD 2024) is released on arXiv.
- Oct 1, 2023. I release an early version of this project, namely FastSDXL, an attempt at building a diffusion engine.
  - The source code is released on GitHub.
  - FastSDXL includes a trainable OLSS scheduler for efficiency improvement.
- Nov 15, 2023. I propose FastBlend, a powerful video deflickering algorithm.
- Dec 8, 2023. I decide to develop a new project, aiming to unleash the potential of diffusion models, especially in video synthesis.
- Jan 29, 2024. I propose Diffutoon, a fantastic solution for toon shading.
  - Project Page.
  - The source code is released in this project.
  - The technical report (IJCAI 2024) is released on arXiv.
- Until now, DiffSynth Studio has supported the following models:
Create Python environment:
conda env create -f environment.yml
We find that `conda` sometimes fails to install `cupy` correctly. If that happens, please install it manually. See this document for more details.
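To see whether `cupy` actually made it into the environment, a quick check like the one below can be run once the environment is activated. This is only a minimal sketch using the Python standard library; the helper name `is_installed` is hypothetical and not part of DiffSynth Studio. It only tests that the package can be found, not that it works with your CUDA setup.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found in the current environment.

    Hypothetical helper for illustration; it looks the package up
    without importing it.
    """
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed("cupy"):
        print("cupy was found in this environment.")
    else:
        print("cupy is missing; please install it manually.")
```

If the script reports that `cupy` is missing, fall back to the manual installation described above.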
Enter the Python environment:
conda activate DiffSynthStudio
Launch the web UI with Streamlit:
python -m streamlit run DiffSynth_Studio.py
Demo video: sdxl_turbo_ui.mp4
The Python examples are in `examples`. We provide an overview here.
Generate high-resolution images by breaking the resolution limits of diffusion models! See `examples/image_synthesis`.
512*512 | 1024*1024 | 2048*2048 | 4096*4096 |
---|---|---|---|
1024*1024 | 2048*2048 |
---|---|
Render realistic videos in a flat style and enable video editing features. See `examples/Diffutoon`.
Demo videos: Diffutoon.mp4, Diffutoon_edit.mp4
Video stylization without video models. See `examples/diffsynth`.
Demo video: winter_stone.mp4
Use Hunyuan-DiT to generate images with Chinese prompts. We also support LoRA fine-tuning of this model. See `examples/hunyuan_dit`.
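Prompt: 少女手捧鲜花,坐在公园的长椅上,夕阳的余晖洒在少女的脸庞,整个画面充满诗意的美感 (English: A girl holding flowers sits on a park bench, the glow of the setting sun falling on her face; the whole scene is full of poetic beauty.)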
1024x1024 | 2048x2048 (highres-fix) |
---|---|
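Prompt: 一只小狗蹦蹦跳跳,周围是姹紫嫣红的鲜花,远处是山脉 (English: A puppy bounces along, surrounded by brilliantly colorful flowers, with mountains in the distance.)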
Without LoRA | With LoRA |
---|---|