This is an Extension for the Automatic1111 Webui that performs Color Grading during generation, producing colors that are more neutral and balanced, yet also vibrant and high in contrast.
This is the result of joint research between TimothyAlexisVass, with their findings, and me, with my experience in developing Vectorscope CC.
Note: This Extension is disabled during the ADetailer phase to prevent inconsistent colors.
This Extension comes with two main effects, Recenter and Normalization:
TimothyAlexisVass discovered that the values of the latent noise Tensor often start off-centered, and that the mean of each channel tends to drift away from 0. Therefore, I wrote a function to guide the mean back to 0.
When you enable Recenter, the output images will not have a biased color tint, and colors will be distributed more evenly. Additionally, the brightness will be adjusted so that bright areas are not blown out and dark areas are not clipped, producing an effect similar to HDR photos.
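The recentering idea can be sketched as follows. This is a minimal illustration and not the Extension's actual code: it uses NumPy instead of the PyTorch Tensors found in the Webui, it applies the correction in a single step rather than during sampling, and the `recenter` function and its `strength` parameter are hypothetical names chosen for this example.

```python
import numpy as np

def recenter(latent: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Shift the mean of each channel toward 0.

    latent:   array of shape [batch, channels, height, width]
    strength: hypothetical knob; 0.0 leaves the latent untouched,
              1.0 removes the whole offset in one go
    """
    # Mean of every channel, computed per image over the spatial axes
    channel_mean = latent.mean(axis=(2, 3), keepdims=True)
    return latent - strength * channel_mean

rng = np.random.default_rng(0)
# Fake latent noise with a deliberate per-channel offset
offset = np.array([0.5, -0.3, 0.2, 0.1]).reshape(1, 4, 1, 1)
latent = rng.normal(size=(1, 4, 64, 64)) + offset

centered = recenter(latent)
print(latent.mean(axis=(2, 3)))    # roughly the offsets above
print(centered.mean(axis=(2, 3)))  # all very close to 0
```

A `strength` below 1.0 would only nudge the mean, which is closer in spirit to guiding the value back gradually over several steps.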
By encoding images back into latent noise using the VAE, TimothyAlexisVass discovered that the resulting values usually fall within a certain range, and thus theorized that if the final latent noise has a smaller value range than normal, then some precision is essentially wasted. This gave me the idea to write a function that makes the latent noise utilize the full depth.
When you enable Normalization, the latent noise will attempt to span the full value range if possible, before being decoded by the VAE. As a result, bright areas will get brighter and dark areas will get darker, and additional details may also be introduced in these areas.
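A rough sketch of the normalization idea is shown below. It is an assumption-laden illustration, not the Extension's implementation: the `normalize` function is a hypothetical name, NumPy stands in for PyTorch, and the bound of 4.0 is a made-up placeholder, since the actual value range is not stated here.

```python
import numpy as np

def normalize(latent: np.ndarray, bound: float = 4.0) -> np.ndarray:
    """Stretch each channel about 0 so its largest magnitude reaches `bound`.

    latent: array of shape [batch, channels, height, width]
    bound:  hypothetical edge of the "full" value range (assumed, not sourced)
    """
    # Largest absolute value of every channel, per image
    peak = np.abs(latent).max(axis=(2, 3), keepdims=True)
    # Guard against division by zero for an all-zero channel
    scale = bound / np.maximum(peak, 1e-8)
    # Only ever stretch; a channel that already exceeds the bound is left alone
    scale = np.maximum(scale, 1.0)
    return latent * scale

rng = np.random.default_rng(0)
# Fake latent noise that only uses a small part of the assumed range
latent = rng.normal(size=(1, 4, 64, 64)) * 0.5

stretched = normalize(latent)
print(np.abs(latent).max(axis=(2, 3)))     # well below the bound
print(np.abs(stretched).max(axis=(2, 3)))  # each channel now peaks at the bound
```

Scaling about 0 keeps the sign of every value, so positive values move further up and negative values further down, which matches the "brighter brights, darker darks" behavior described above.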
Both features increase the image file size when enabled, suggesting that the images "contain more information".
- You can enable both features at the same time to generate some stunning images
- This Extension supports both SD 1.5 and SDXL checkpoints
In the Diffusion CG section under the Stable Diffusion category in the Settings tab, you can make either feature enabled by default, as well as set the Stable Diffusion architecture to start with.
Structures of Stable Diffusion
The Tensor of the latent noise has a dimension of [batch, 4, height / 8, width / 8].
- For SD 1.5: From my trial and error when developing Vectorscope CC, each of the 4 channels essentially represents the -K, -M, C, Y color for the CMYK color model.
- For SDXL: According to TimothyAlexisVass's Blogpost, the first 3 channels represent the Y', -Cr, -Cb color for the Y'CbCr color model, while the 4th channel is the pattern/structure.
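
To make the layout above concrete, here is a small sketch that computes the latent shape for a given image size and lists the channel meanings per architecture. The names `latent_shape` and `CHANNEL_LABELS` are hypothetical and exist only for this example; they are not part of the Extension or the Webui.

```python
# Channel meanings as described above; the dictionary itself is illustrative
CHANNEL_LABELS = {
    "SD 1.5": ("-K", "-M", "C", "Y"),
    "SDXL": ("Y'", "-Cr", "-Cb", "pattern/structure"),
}

def latent_shape(batch: int, height: int, width: int) -> tuple:
    """Shape of the latent Tensor for images of the given pixel size."""
    # The latent is 8x smaller than the image along each spatial axis
    return (batch, 4, height // 8, width // 8)

print(latent_shape(1, 512, 512))    # (1, 4, 64, 64)
print(latent_shape(2, 1024, 1024))  # (2, 4, 128, 128)

for index, label in enumerate(CHANNEL_LABELS["SDXL"]):
    print(f"channel {index}: {label}")
```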