Fine-tuning Stable Diffusion and Prompt-to-Prompt Editing Techniques

In this project, I use Dreambooth, a fine-tuning technique that teaches a text-to-image diffusion model a specific concept such as a character or a style. After fine-tuning, the model can generate contextualized images of the subject in different scenes, poses, and views. You can find the implementation of Dreambooth here.

Dreambooth is a method to personalize text-to-image models given just a few (3-5) images of a subject. However, the Stable Diffusion community has found that using 10 to 12 images leads to better results. Consequently, I fine-tuned the 'runwayml/stable-diffusion-v1-5' model on two sets of 12 images each, featuring two of my friends, identified by the tokens 'bnh' and 'tuki'. The class prompts used for prior preservation were 'Keanu Reeves' and 'Justin Bieber', respectively (both in the human class).

bnh keanu reeves...
tuki justin bieber...
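
To make the setup above concrete, here is a minimal sketch of how the two subjects and the prior-preservation objective fit together. It is not the training code used in this project: the names `DreamboothSubject`, `mse`, and `prior_preservation_loss` are hypothetical, the instance prompt format (identifier followed by class prompt) is inferred from the prompts shown in this README, and the default weight of 1.0 is an assumption. Plain Python lists stand in for noise-prediction tensors.

```python
from dataclasses import dataclass


@dataclass
class DreamboothSubject:
    identifier: str       # rare token naming the subject, e.g. "bnh"
    class_prompt: str     # class prompt used for prior preservation
    num_images: int = 12  # images per subject, as described above

    @property
    def instance_prompt(self) -> str:
        # Assumed format: "<identifier> <class prompt>"
        return f"{self.identifier} {self.class_prompt}"


def mse(pred, target):
    """Mean squared error between two equal-length lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def prior_preservation_loss(inst_pred, inst_target,
                            class_pred, class_target,
                            prior_loss_weight=1.0):
    """Instance loss plus a weighted loss on class (prior) images.

    The second term discourages the model from forgetting what the
    class looks like while it learns the new subject.
    """
    instance_loss = mse(inst_pred, inst_target)
    prior_loss = mse(class_pred, class_target)
    return instance_loss + prior_loss_weight * prior_loss


subjects = [
    DreamboothSubject("bnh", "keanu reeves"),
    DreamboothSubject("tuki", "justin bieber"),
]
```

In a real training loop, the two terms would be computed on the model's noise predictions for an instance batch and a class batch, respectively.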

Subsequently, I apply various Prompt-to-Prompt text-based editing operations. Prompt-to-Prompt gives users a simple and intuitive way to edit images: it leverages the semantic power of text while preserving the original composition and structure.

Prompts, where "..." is replaced by the subject shown:

bnh keanu reeves:
	photo of ... wearing a pair of sunglasses on a beach
	van gogh painting of ... wearing a pair of sunglasses on a beach

tuki justin bieber:
	photo of ... wearing a pair of sunglasses and taking a selfie in front of a mirror
	van gogh painting of ... wearing a pair of sunglasses and taking a selfie in front of a mirror

Text-Only Localized Editing

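Localized editing changes one part of the image by modifying only the corresponding words in the user-provided prompt (here, swapping the food item), while preserving the spatial layout, geometry, and semantics of the rest of the image.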

Prompts and the word swapped in each:

A painting of a bnh keanu reeves eating a... : burger → cake
A painting of a tuki justin bieber eating a... : burger → lasagne

Image columns: Original, without Prompt-to-Prompt, with Prompt-to-Prompt
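
The word-swap edit can be sketched as a rule that decides, at each denoising step, which cross-attention maps the edited generation uses. The sketch below is an illustration under assumptions, not this project's code: `word_swap_attention` and `replace_fraction` are hypothetical names, steps are counted upward from 0, and the 0.8 default is only an example value. Attention maps are treated as opaque objects.

```python
def word_swap_attention(step, num_steps, source_attn, target_attn,
                        replace_fraction=0.8):
    """Pick the attention maps for the edited image at a given step.

    For the first `replace_fraction` of the denoising steps, the maps
    from the original prompt are injected, which fixes layout and
    geometry. For the remaining steps, the edited prompt's own maps are
    used so the swapped word (e.g. "cake") can shape its own details.
    """
    inject_until = round(num_steps * replace_fraction)
    return source_attn if step < inject_until else target_attn


# Which maps are used at each of 50 steps (labels stand in for tensors).
schedule = [
    word_swap_attention(step, 50, "source", "target")
    for step in range(50)
]
```

A smaller `replace_fraction` gives the new word more freedom to change the geometry, and a larger one keeps the result closer to the original image.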

Global Editing

Global editing affects all parts of the image but still retains the original composition.

Prompts, where "..." is replaced by "a bnh keanu reeves":
	(Original) drawing of ... on a snowy mountain
	van gogh painting of ... on a snowy mountain
	van gogh painting of ... in the jungle
	van gogh painting of ... in a river
	van gogh painting of ... in the desert

Prompts, where "..." is replaced by "a tuki justin bieber wearing a sunglasses in a forest":
	(Original) photo of ...
	charcoal painting of ...
	impressionism painting of ...
	neoclassical painting of ...
	watercolor painting of ...
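
One way to picture a global edit is as a token alignment between the original and the edited prompt: tokens that appear in both keep the original attention maps (preserving composition), while new tokens such as a style word use the edited prompt's maps. The sketch below is a simplified stand-in under assumptions: it splits on whitespace instead of using the model's tokenizer, uses `difflib.SequenceMatcher` in place of a dedicated sequence aligner, and the names `align_tokens` and `refine_attention` as well as the example prompts are hypothetical.

```python
from difflib import SequenceMatcher


def align_tokens(source_prompt, target_prompt):
    """Map each target token index to its source token index, or None."""
    src = source_prompt.split()
    tgt = target_prompt.split()
    mapping = [None] * len(tgt)
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            mapping[block.b + offset] = block.a + offset
    return mapping


def refine_attention(source_attn, target_attn, mapping):
    """Build per-token attention maps for the edited prompt.

    Shared tokens reuse the source map. Tokens that exist only in the
    edited prompt keep their own map.
    """
    return [
        target_attn[j] if i is None else source_attn[i]
        for j, i in enumerate(mapping)
    ]


# Illustrative prompts (not the ones used in this project).
mapping = align_tokens(
    "photo of a cat in a forest",
    "watercolor painting of a cat in a forest",
)
```

Here the first two tokens of the edited prompt have no counterpart in the original, so only those two would take their attention from the edited prompt.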

Attention Re-weighting

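By re-scaling the attention weights of a specified word, we can control how strongly it influences the generated image. In the prompts below, (↓) marks a word whose attention is decreased and (↑) a word whose attention is increased.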


bnh keanu reeves wearing a pair of sunglasses under a blossom(↓) tree


tuki justin bieber wearing a pair of sunglasses under a blossom(↑) tree
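
The re-weighting step itself is a simple scaling of one token's column in the cross-attention map. The sketch below assumes the map is a nested list with one row per image location and one column per prompt token. The name `reweight_attention` is hypothetical, and this sketch does not renormalize the rows after scaling.

```python
def reweight_attention(attn, token_index, scale):
    """Scale the attention every image location pays to one token.

    attn: list of rows, one per image location, each a list of
          per-token attention weights.
    scale > 1 strengthens the token's effect, scale < 1 weakens it.
    Returns a new nested list and leaves `attn` unchanged.
    """
    return [
        [w * scale if j == token_index else w for j, w in enumerate(row)]
        for row in attn
    ]


# Two image locations, two tokens. Token 1 plays the role of "blossom".
attn = [[0.2, 0.8], [0.5, 0.5]]
stronger = reweight_attention(attn, token_index=1, scale=2.0)  # blossom(↑)
weaker = reweight_attention(attn, token_index=1, scale=0.5)    # blossom(↓)
```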

khoi03/Fine-tuning-Stable-Diffusion-and-Prompt-to-Prompt
