/Dreambooth

Fine-tuning of diffusion models

Primary language: Python. License: Apache-2.0.

Fine-tuning of Stable Diffusion models

Run Dreambooth or Low-Rank Adaptation (LoRA) from the same Google Colab notebook.

Tested with Tesla T4 and A100 GPUs on Google Colab. Some settings will not work on the T4 because of its limited memory.

Tested with Stable Diffusion v1-5 and Stable Diffusion v2-base.

This notebook borrows elements from ShivamShrirao's implementation, but differs in several ways:

  • Based on the main branch of Hugging Face Diffusers🧨, so it's easy to stay up to date
  • Low-Rank Adaptation (LoRA) for faster and more efficient fine-tuning (using cloneofsimo's implementation)
  • Data augmentation such as random cropping, flipping and resizing, which can reduce manual image preparation and cropping in certain cases (e.g., training a style)
  • More parameters for experimentation (LoRA rank, Adam optimizer parameters, cosine_with_restarts learning rate scheduler, etc.), all of which are dumped to a JSON file so you can remember what you did
  • Drops some text-conditioning during training to improve classifier-free guidance sampling (as was done when Stable Diffusion v1-5 was fine-tuned)
  • Image captioning using filenames or associated text files
  • Training loss and prior class loss are tracked separately (can be visualized using TensorBoard)
  • Option to generate exponential moving average (EMA) weights for the UNet
  • Inference with trained models uses Diffusers🧨 pipelines and does not rely on any web apps
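
Three of the features above (filename or text-file captions, text-conditioning dropout, and EMA weights) are simple enough to sketch in plain Python. The block below is only an illustration under stated assumptions, not the notebook's actual code: the function names `caption_for`, `maybe_drop_caption` and `ema_update` are hypothetical, the underscore-to-space filename convention is assumed, and weights are represented as flat lists of floats instead of tensors.

```python
import random
from pathlib import Path


def caption_for(image_path):
    """Return a caption from a sidecar .txt file if one exists, else from the filename."""
    image_path = Path(image_path)
    sidecar = image_path.with_suffix(".txt")
    if sidecar.is_file():
        return sidecar.read_text(encoding="utf-8").strip()
    # Assumed convention: underscores in the filename stand for spaces.
    return image_path.stem.replace("_", " ").strip()


def maybe_drop_caption(caption, drop_prob, rng=random):
    """Replace the caption with an empty prompt with probability drop_prob.

    Training on some unconditional (empty-prompt) examples is what lets
    classifier-free guidance contrast conditional and unconditional
    predictions at sampling time.
    """
    return "" if rng.random() < drop_prob else caption


def ema_update(ema_weights, weights, decay=0.999):
    """One EMA step per weight: ema <- decay * ema + (1 - decay) * weight."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the demo is repeatable
    caption = caption_for("a_photo_of_sks_dog.jpg")
    # 0.1 is an illustrative drop probability, not a setting taken from the notebook.
    batch = [maybe_drop_caption(caption, 0.1, rng) for _ in range(8)]
    print(batch)

    ema = [0.0, 0.0]
    for step_weights in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
        ema = ema_update(ema, step_weights, decay=0.9)
    print(ema)
```

In a real training loop the same EMA update would be applied to every UNet parameter after each optimizer step, and the dropped caption would be tokenized like any other prompt.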

Image comparing Dreambooth and LoRA (more information here):

full-size image here for the pixel-peepers