/sd-webui-diffusion-cg

An Extension for Automatic1111 Webui that performs color grading based on the latent tensor value range

Primary language: Python. License: MIT.

SD Webui Diffusion Color Grading

This is an Extension for the Automatic1111 Webui that performs Color Grading during generation, producing colors that are more neutral and balanced, yet also vibrant and contrasty.

This is the result of joint research between TimothyAlexisVass, with their findings, and me, with my experience developing Vectorscope CC.

Note: This Extension is disabled during the ADetailer phase to prevent inconsistent colors.

Features

This Extension comes with two main effects, Recenter and Normalization:

Recenter

Abstract

TimothyAlexisVass discovered that the values of the latent noise Tensor often start off-centered, and that the mean of each channel tends to drift away from 0. Therefore, I wrote a function to guide the mean back to 0.
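To illustrate the idea, here is a minimal sketch of recentering, not the Extension's actual code. It uses NumPy arrays as a stand-in for the latent Tensor, and the `recenter` function and its `strength` parameter are hypothetical names chosen for this example. It also removes the offset in a single step, whereas guiding the mean during generation would presumably spread the correction over the sampling steps.

```python
import numpy as np


def recenter(latent: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Shift each channel of a [batch, 4, h, w] latent toward zero mean.

    `strength` is a hypothetical knob for this sketch: 1.0 removes the
    whole offset, smaller values remove only part of it.
    """
    # Mean over the spatial axes, kept as [batch, 4, 1, 1] for broadcasting.
    channel_mean = latent.mean(axis=(2, 3), keepdims=True)
    return latent - strength * channel_mean


rng = np.random.default_rng(0)
# Fake latent with a deliberate offset on each channel.
offset = np.array([0.5, -0.3, 0.2, 0.0]).reshape(1, 4, 1, 1)
latent = rng.normal(size=(1, 4, 64, 64)) + offset

centered = recenter(latent)
print(centered.mean(axis=(2, 3)))  # every channel mean is now very close to 0
```

Subtracting the per-channel mean only shifts the values; the spread of each channel is left untouched.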

Effects

When you enable the feature, the output images will not have a biased color tint, and all colors will be distributed more evenly. Additionally, the brightness will be adjusted so that bright areas are not blown out and dark areas are not clipped, producing an effect similar to HDR photos.

Samples


Off | On

Normalization

Abstract

By encoding images back into latent noise using the VAE, TimothyAlexisVass discovered that the resulting values usually fall within a certain range, and thus theorized that if the final latent noise has a smaller value range than normal, then some precision is essentially wasted. This gave me the idea to write a function that makes the latent noise utilize the full depth.

Effects

When you enable the feature, the latent noise will attempt to span across the full value range if possible, before getting decoded by the VAE. As a result, bright areas will get brighter and dark areas will get darker, and additional details may also be introduced in these areas.
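As a rough sketch of what "spanning the full value range" could look like, the example below stretches each channel until its largest magnitude reaches a target bound. This is an assumption-laden illustration rather than the Extension's implementation: the `normalize` function is hypothetical, NumPy stands in for the latent Tensor, and the `bound` value of 4.0 is a placeholder, since the actual range depends on the model and is not stated here.

```python
import numpy as np


def normalize(latent: np.ndarray, bound: float = 4.0) -> np.ndarray:
    """Stretch each channel of a [batch, 4, h, w] latent toward +/- `bound`.

    `bound` is a hypothetical placeholder for the expected latent range.
    Channels are only ever stretched, never shrunk.
    """
    # Largest absolute value per channel, shaped [batch, 4, 1, 1].
    peak = np.abs(latent).max(axis=(2, 3), keepdims=True)
    # Scale factor that would bring the peak up to `bound`; the inner
    # maximum guards against division by zero for an all-zero channel.
    stretch = bound / np.maximum(peak, 1e-12)
    scale = np.where(peak > 0, np.maximum(stretch, 1.0), 1.0)
    return latent * scale


rng = np.random.default_rng(0)
# Fake latent that only uses a small part of the assumed range.
latent = rng.normal(scale=0.5, size=(1, 4, 32, 32))

stretched = normalize(latent)
print(np.abs(latent).max(axis=(2, 3)))     # peaks well below the bound
print(np.abs(stretched).max(axis=(2, 3)))  # each peak now sits at the bound
```

Because every value in a channel is multiplied by the same factor, highs get higher and lows get lower while the zero point stays where it is.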


Off | On

Both features increase the image file size when enabled, suggesting that the images "contain more information".

Misc.

  • You can enable both features at the same time to generate some stunning images
  • This Extension supports both SD 1.5 and SDXL checkpoints


Off | On

Settings

In the Diffusion CG section under the Stable Diffusion category in the Settings tab, you can make either feature enabled by default, as well as set the Stable Diffusion architecture to start with.


Structures of Stable Diffusion

The Tensor of the latent noise has a shape of [batch, 4, height / 8, width / 8].

  • For SD 1.5: From my trial and error when developing Vectorscope CC, the 4 channels essentially represent the -K, -M, C, and Y colors of the CMYK color model.

  • For SDXL: According to TimothyAlexisVass's blog post, the first 3 channels represent the Y', -Cr, and -Cb components of the Y'CbCr color model, while the 4th channel is the pattern/structure.
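The layout described above can be written down as a small lookup, shown here as a sketch. The names `latent_shape` and `CHANNEL_MEANING` are made up for this example and are not part of the Extension; the channel labels are simply the ones listed above.

```python
LATENT_CHANNELS = 4  # channels in the latent Tensor
DOWNSCALE = 8        # latent is 1/8 of the image size on each side

# Rough meaning of each latent channel, per architecture, as listed above.
CHANNEL_MEANING = {
    "SD 1.5": ("-K", "-M", "C", "Y"),
    "SDXL": ("Y'", "-Cr", "-Cb", "pattern/structure"),
}


def latent_shape(batch: int, height: int, width: int) -> tuple:
    """Return the [batch, 4, height / 8, width / 8] shape for an image size."""
    return (batch, LATENT_CHANNELS, height // DOWNSCALE, width // DOWNSCALE)


print(latent_shape(1, 512, 768))  # (1, 4, 64, 96)
for index, label in enumerate(CHANNEL_MEANING["SDXL"]):
    print(f"SDXL channel {index}: {label}")
```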