Vector_Sculptor_ComfyUI

Gather similar vectors within the CLIP weights and use them to redirect the original weights


The main node pushes your conditioning toward similar concepts to enrich your composition, or away from them to make it more precise.

It gathers similar pre-conditioning vectors for as long as the cosine similarity score keeps diminishing; as soon as the score climbs back up, it stops. This makes it possible to set a direction relative to similar concepts.
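
To make the gathering loop concrete, here is a loose sketch of how it might look. The names, shapes, and exact stopping test are illustrative assumptions, not the repository's actual code:

```python
import torch

def gather_neighbours(token_vec, embedding_weights, max_neighbours=32):
    # token_vec: one pre-conditioning vector, shape [dim]
    # embedding_weights: the CLIP token embedding table, shape [vocab, dim]
    sims = torch.nn.functional.cosine_similarity(
        embedding_weights, token_vec.unsqueeze(0), dim=-1
    )
    scores, order = sims.sort(descending=True)

    # Gather neighbours while the similarity score keeps diminishing;
    # stop at the first non-decrease (a literal reading of the rule above),
    # with a hard cap as a practical safeguard.
    gathered = []
    prev = scores[0]
    for score, idx in zip(scores[1:], order[1:]):  # skip the token itself
        if score >= prev or len(gathered) >= max_neighbours:
            break
        gathered.append(embedding_weights[idx])
        prev = score
    return torch.stack(gathered) if gathered else token_vec.unsqueeze(0)

def sculpt(token_vec, embedding_weights, intensity=0.5, forward=True):
    # forward subtracts the neighbours' mean direction, backward adds it
    # (matching the method descriptions further below).
    direction = gather_neighbours(token_vec, embedding_weights).mean(dim=0)
    sign = -1.0 if forward else 1.0
    return token_vec + sign * intensity * direction
```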

MNeMoNiCuZ also shared his tests.

The nodes:


From backward 5 to forward 1.5, using "average keep magnitude" up to the middle and slerp on the way back:

[video: slurping.mp4]

Vector sculptor text encode:


Does what I described above.

  • Sculptor Intensity: How strong the effect will be. Forward works best from 0 to 1 for photorealistic images and from 1 to 2 for more artistic purposes.
  • Sculptor method:
    • forward: Subtracts the nearest vectors. Going above 1 might have adverse effects.
    • backward: Adds them instead.
    • maximum_absolute: Normalizes the vectors and selects the values that are furthest from 0. The intensity has no effect here besides disabling the node when set to 0. This tends to make compositions more complex on simple subjects and more chaotic on complex prompts. It can be beneficial or not depending on the subject; it is mostly for fun, but can give extremely nice results with abstract concepts.
  • Token normalization: Reworks the magnitude of each vector (see the sketch after this list). I recommend either "none", which leaves things at their default, or "mean", which sets every token's importance to their overall mean value. "set at 1" sets them all to 1, and I have no idea if this is a good idea. "mean of all tokens" takes the mean value of EVERY vector within the pre-cond weights; it is probably a bad idea, but why not.
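
A rough sketch of what those normalization modes could look like; the tensor shapes and helper name are assumptions for illustration, not the node's actual implementation:

```python
import torch

def normalize_tokens(cond, mode="none", table_mean_norm=None):
    # cond: assumed pre-conditioning tensor of shape [tokens, dim]
    norms = cond.norm(dim=-1, keepdim=True)   # each token's magnitude
    if mode == "none":
        return cond                           # leave magnitudes untouched
    unit = cond / norms                       # unit-length token vectors
    if mode == "mean":
        return unit * norms.mean()            # every token set to the prompt's mean magnitude
    if mode == "set at 1":
        return unit                           # every token's magnitude forced to 1
    if mode == "mean of all tokens":
        # table_mean_norm: the mean magnitude over EVERY vector in the
        # pre-cond weights, computed elsewhere and passed in.
        return unit * table_mean_norm
    raise ValueError(f"unknown mode: {mode}")
```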

If the intensity is set to 0, token normalization still takes effect. Setting the intensity to 0 and selecting "none" returns a default Comfy conditioning.

Both directions offer valid variations no matter the subject.

For general use I recommend forward at 0.5 for the positive prompt and letting the negative prompt "stay in place".

Normalizing the token magnitudes to their mean seems to have a positive effect too, especially on the negative prompt, where it tends to lower my ratio of burned images.

Conditioning (Slerp) and Conditioning (Average keep magnitude):


Since we are working with vectors, plain weighted averages might be the reason why things sometimes feel "diluted":

"Conditioning (Average keep magnitude)" is a cheap slerp which does a weighted average with the conditionings and their magnitudes.

"Conditioning (Slerp)" will do an actual slerp and might be preferable.


With an average we lose magnitude.
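
A minimal sketch of the two nodes, treating each conditioning as a single flattened vector for brevity (an assumed simplification):

```python
import torch

def slerp(a, b, t):
    # Spherical linear interpolation: rotate from a toward b along the
    # arc between them, so the in-between magnitude is preserved.
    a_n, b_n = a / a.norm(), b / b.norm()
    omega = torch.acos(torch.clamp((a_n * b_n).sum(), -1.0, 1.0))
    if omega.abs() < 1e-6:
        return a * (1 - t) + b * t   # nearly parallel: plain lerp is fine
    so = torch.sin(omega)
    return (torch.sin((1 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b

def average_keep_magnitude(a, b, t):
    # "Cheap slerp": weighted average of the conditionings, rescaled to
    # the weighted average of their magnitudes.
    avg = a * (1 - t) + b * t
    target = a.norm() * (1 - t) + b.norm() * t
    return avg * (target / avg.norm())
```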

"Conditioning normalize magnitude to empty:


Rescales the overall intensity of the conditioning to match that of an empty conditioning. I have no idea if this is actually a good idea. It tends to give images with more balanced colors and contrast, but it also gave me more molten people.
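
The operation itself is presumably just a rescale; a short sketch under that assumption:

```python
def normalize_to_empty(cond, empty_cond):
    # cond, empty_cond: torch tensors. Scale the conditioning so its
    # overall magnitude matches that of an empty-prompt conditioning.
    return cond * (empty_cond.norm() / cond.norm())
```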

If using SDXL, the values are processed separately for clip_g and clip_l.

Examples (same seed side by side):

SD 1.x

[example images]

SDXL:

[example images]

At forward 0.5~1.0, it seems to cure the "always the same face" effect:

[example image]

With a lower intensity the effect can still be seen without necessarily changing the composition:

[example images]

"dark enchanted forest with colorful glowing lights, digital painting, night, black amoled wallpaper, wintery fog, fantasy"

[example images]

Too much forward will overshoot your intended meaning and become more unpredictable:

[example images]

More on SD 1.5:

[example images: part 1, part 2, part 3]

My examples are all workflows.

If you want to try them you will need a few of my random useful nodes.

You can find all of them here, here and here.

Note:

I make these nodes to try to understand things more deeply. My maths can sometimes be wrong. Everything I do is self-taught, with an overly top-to-bottom approach.

Some things, like the fact that I do not turn the cosine similarity score into a linear value, might look like oversights. Doing so tends to diminish the node's effect too strongly, so it is a deliberate choice.
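
For reference, turning the raw cosine score into a linear (angular) value would look something like this; the snippet is purely illustrative:

```python
import math

cos_sim = 0.87                          # an example raw cosine score
linear = math.acos(cos_sim) / math.pi   # 0.0 = identical, 1.0 = opposite
# The node keeps the raw cosine instead, since this remapping
# weakens its effect too much.
```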

I feel that what I have done might also be done by tweaking the activation functions but I haven't got that far. Yet.

Also, kittens seem immune to this transformation and inherit little suits before turning into AI chaos.

[kitten example image]

Pro tip:

Did you know that my main activity is writing creative model-merging functions?

While the code is too much of a mess to be shared, I do expose and share my models. You can find them in this gallery! 😁