This repo is the official Stable-Diffusion-webui extension implementation of "DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Contrastive Prompt-Tuning".
Standalone version: DreamArtist
Everyone is an artist. Rome wasn't built in a day, but your artist dreams can be!
With just one training image, DreamArtist learns its content and style, generating diverse, high-quality images with high controllability. DreamArtist embeddings can easily be combined with additional descriptions, and two learned embeddings can also be combined with each other.
Clone this repo into the extensions folder of Stable-Diffusion-webui:
git clone https://github.com/7eu7d7/DreamArtist-sd-webui-extension.git extensions/DreamArtist
First, create the positive and negative embeddings in the DreamArtist Create Embedding tab.
After that, fill the names of the positive and negative embeddings ({name} and {name}-neg) into the txt2img tab, together with some common descriptions. This ensures a correct preview image.
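As a rough illustration of the naming convention, the sketch below builds the two prompt strings from an embedding name. The helper `dreamartist_prompts` and the example name `my-style` are hypothetical, and placing {name} in the prompt and {name}-neg in the negative prompt is an assumption about how the pair is meant to be used.

```python
def dreamartist_prompts(name, description="", negative_description=""):
    """Build (prompt, negative_prompt) strings for an embedding pair.

    Assumption: the positive embedding goes into the prompt and the
    "-neg" embedding goes into the negative prompt.
    """
    prompt = ", ".join(p for p in (name, description) if p)
    negative_prompt = ", ".join(p for p in (f"{name}-neg", negative_description) if p)
    return prompt, negative_prompt


# Hypothetical embedding name and descriptions, for illustration only.
prompt, negative_prompt = dreamartist_prompts("my-style", "1girl, outdoors", "lowres")
print(prompt)
print(negative_prompt)
```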
Then, select the positive embedding and set the parameters and the image folder path in the DreamArtist Train tab to start training. The corresponding negative embedding is loaded automatically.
If your VRAM is low or you want to save time, you can uncheck the reconstruction option.
It is better to train without filewords.
Remember to check the option below; otherwise, the preview will be wrong.
Fill the trained positive and negative embeddings into txt2img to generate with the DreamArtist prompt.
An Attention Mask can strengthen or weaken the learning intensity of local areas. The Attention Mask is a grayscale image whose grayscale values map to learning intensity as shown in the following table.
grayscale | 0% | 25% | 50% | 75% | 100% |
---|---|---|---|---|---|
intensity | 0% | 50% | 100% | 300% | 500% |
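The table lists only five grayscale levels. A minimal sketch of how intermediate values could be mapped is given below; linear interpolation between the rows is an assumption, not something stated here, and `MASK_TABLE` and `mask_intensity` are hypothetical names.

```python
# Table rows as (grayscale fraction, learning-intensity multiplier).
MASK_TABLE = [(0.0, 0.0), (0.25, 0.5), (0.5, 1.0), (0.75, 3.0), (1.0, 5.0)]


def mask_intensity(gray):
    """Map a grayscale value in [0, 1] to an intensity multiplier.

    Assumption: values between table rows are linearly interpolated.
    """
    if not 0.0 <= gray <= 1.0:
        raise ValueError("gray must be in [0, 1]")
    # Walk consecutive table rows and interpolate inside the matching segment.
    for (g0, i0), (g1, i1) in zip(MASK_TABLE, MASK_TABLE[1:]):
        if gray <= g1:
            t = (gray - g0) / (g1 - g0)
            return i0 + t * (i1 - i0)
    return MASK_TABLE[-1][1]
```

Under this assumption, a mid-gray pixel (50%) leaves the learning intensity unchanged, darker pixels weaken it, and lighter pixels strengthen it.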
The Attention Mask must be placed in the same folder as the training image, and its file name is the training image's name followed by "_att". You can choose whether to enable the Attention Mask for training.
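The naming rule can be sketched as a small helper. `attention_mask_path` is a hypothetical function, and it assumes the mask keeps the same file extension as the training image, which is not stated here.

```python
from pathlib import Path


def attention_mask_path(image_path):
    """Return the expected mask path: image name + "_att", same folder.

    Assumption: the mask uses the training image's file extension.
    """
    image_path = Path(image_path)
    return image_path.with_name(image_path.stem + "_att" + image_path.suffix)


# Hypothetical training image path, for illustration only.
print(attention_mask_path("train/cat.png"))
```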
Because the VAE contains a self-attention operation, it may change the distribution of features. The Process Att-Map tab can superimpose the self-attention map on the original Att-Map.
Dynamic CFG can improve performance, especially when the dataset is large (>20 images). For example, the CFG scale can vary linearly from 1.5 to 3.0 (1.5-3.0), follow a cosine over the 0 to π/2 interval (1.5-3.0:cos), or follow a cosine over the -π/2 to 0 interval (1.5-3.0:cos2). You can also supply a custom non-linear function, such as 2.5-3.5:torch.sqrt(rate), where rate is a variable ranging from 0 to 1.
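To make the spec strings concrete, here is a minimal sketch of one way such a schedule could be evaluated. It is not the extension's actual code: the function names, the exact curve formulas, the low-to-high direction, and treating a single number as a constant scale are all assumptions. Custom expressions are not evaluated from the string; the sketch takes a Python callable instead.

```python
import math


def parse_dynamic_cfg(spec):
    """Split a spec like "1.5-3.0:cos" into (low, high, shape)."""
    value_range, _, shape = spec.partition(":")
    low_text, sep, high_text = value_range.partition("-")
    low = float(low_text)
    # Assumption: a single number (no "-") means a constant CFG scale.
    high = float(high_text) if sep else low
    return low, high, shape or "linear"


# Assumed weight curves, each mapping rate in [0, 1] to a weight in [0, 1].
SHAPES = {
    "linear": lambda rate: rate,
    # Cosine sampled over [0, π/2]: slow start, fast finish.
    "cos": lambda rate: 1.0 - math.cos(rate * math.pi / 2),
    # Cosine sampled over [-π/2, 0]: fast start, slow finish.
    "cos2": lambda rate: math.cos((rate - 1.0) * math.pi / 2),
}


def dynamic_cfg(spec, rate, custom=None):
    """Return the CFG scale for a given rate in [0, 1]."""
    low, high, shape = parse_dynamic_cfg(spec)
    if shape in SHAPES:
        weight = SHAPES[shape](rate)
    elif custom is not None:
        # The expression text is ignored; the caller supplies the function.
        weight = custom(rate)
    else:
        raise ValueError(f"unknown shape {shape!r} and no custom function given")
    return low + (high - low) * weight


for spec in ("1.5-3.0", "1.5-3.0:cos", "1.5-3.0:cos2"):
    print(spec, [round(dynamic_cfg(spec, r), 3) for r in (0.0, 0.5, 1.0)])
```

With these assumed curves, all three built-in shapes start at the lower bound and end at the upper bound; they differ only in how quickly the scale rises in between.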
- Stable Diffusion v1.4
- Stable Diffusion v1.5
- animefull-latest
- Anything v3.0
- momoko-e
Embeddings can be transferred between different models trained on the same dataset.