Extraltodeus/ComfyUI-AutomaticCFG

Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models

ericbeyer opened this issue · 7 comments

Have you seen this paper? Curious how it would apply to your method. https://arxiv.org/pdf/2404.07724.pdf

I guess you just found my next implementation challenge!

edit: also thanks :D

With SD-XL, we apply guidance at 50% of the sampling steps, corresponding to noise levels
σ ∈ (0.28, 5.42], with weight γ = 16. These parameters were chosen by visual inspection. The
beneficial interval is wider than in ImageNet, likely due to the more varied dataset used in the
training of SD-XL. Consequently, our method leads to over 20% speed-up due to a lower number of
unconditional model evaluations [1].
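For reference, here is a minimal sketch of what the quoted passage describes, as I read it: standard classifier-free guidance with the weight applied only while sigma is inside the interval, and only the conditional prediction used outside it. The interval bounds and the weight come from the quote; the function names and the callable-based interface are made up for illustration and are not taken from the paper's code or from this repository.

```python
def guidance_weight(sigma, weight=16.0, sigma_lo=0.28, sigma_hi=5.42):
    """Return `weight` while sigma is in (sigma_lo, sigma_hi], else 1.0 (no guidance)."""
    return weight if sigma_lo < sigma <= sigma_hi else 1.0


def guided_prediction(cond_fn, uncond_fn, x, sigma, **interval):
    """Combine the conditional and unconditional predictions for one step.

    cond_fn / uncond_fn are hypothetical callables (x, sigma) -> prediction.
    Outside the interval the unconditional model is never called, which is
    where the speed-up comes from.
    """
    cond = cond_fn(x, sigma)
    w = guidance_weight(sigma, **interval)
    if w == 1.0:
        return cond  # guidance disabled: one model evaluation only
    uncond = uncond_fn(x, sigma)
    return uncond + w * (cond - uncond)  # usual classifier-free guidance mix
```

With plain numbers standing in for model outputs, `guided_prediction(lambda x, s: 2.0, lambda x, s: 1.0, 0.0, 1.0)` mixes both predictions, while the same call at `sigma=10.0` returns the conditional value untouched.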

From what I've read so far, this suggests that enabling the conditioning only from around the middle of sampling is beneficial. Combined with disabling it at the end, as I already do, it could speed up generation a lot more. I wonder how much disabling it overall hurts quality. It reminds me of when I tried a conditioning that was multiplied by 0 at the start and ramped up to 100% by 50% of the steps. It did give interesting results, but without any speed advantage because of my experimental implementation.

Here are the normalized sigmas for 24 steps, as (step, sigma) pairs:

(0, 1.0), (1, 0.8341), (2, 0.6923), (3, 0.5717), (4, 0.4696), (5, 0.3835), (6, 0.3114), (7, 0.2511), (8, 0.2012), (9, 0.16), (10, 0.1263), (11, 0.0989), (12, 0.0767), (13, 0.0589), (14, 0.0448), (15, 0.0337), (16, 0.0251), (17, 0.0184), (18, 0.0133), (19, 0.0095), (20, 0.0066), (21, 0.0045), (22, 0.003), (23, 0.002), (24, 0.0)

50% of the steps, as mentioned in the paper, corresponds to enabling the guidance at around 10% of the sigmas.
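A quick way to check this against the list above, assuming the pairs are (step, sigma divided by the largest sigma). The helper name is hypothetical and only serves the example:

```python
# Values copied from the 24-step list above; the index in the list is the step.
SIGMAS = [1.0, 0.8341, 0.6923, 0.5717, 0.4696, 0.3835, 0.3114, 0.2511,
          0.2012, 0.16, 0.1263, 0.0989, 0.0767, 0.0589, 0.0448, 0.0337,
          0.0251, 0.0184, 0.0133, 0.0095, 0.0066, 0.0045, 0.003, 0.002, 0.0]


def first_step_at_or_below(sigmas, fraction):
    """Return the first step whose sigma is <= fraction * sigmas[0], or None."""
    limit = fraction * sigmas[0]
    for step, sigma in enumerate(sigmas):
        if sigma <= limit:
            return step
    return None


num_steps = len(SIGMAS) - 1  # 25 sigma values describe 24 steps
step = first_step_at_or_below(SIGMAS, 0.10)
print(f"sigma drops to 10% at step {step} of {num_steps}")
print(f"sigma at the halfway step: {SIGMAS[num_steps // 2]}")
```

The two numbers it prints are the step where the sigma first falls to 10% of its starting value and the sigma at the halfway step, which is how the two percentages line up.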

I added the option of a later guidance activation in the following commit. Set "uncond_start_percentage" to 10% in the advanced node to enable guidance at 50% of the steps, as mentioned in the paper.

From experimenting: when combined with the late deactivation, setting the delayed activation at 50% of the sigmas gives 2x speed for the first ~10% of the steps, on top of the later sped-up steps.
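To get a rough feel for the savings, here is a small sketch that counts model evaluations for a sigma window. It assumes the percentages are thresholds on sigma relative to the first sigma, and that a step costs two evaluations with guidance and one without. `start_fraction` and `stop_fraction` are hypothetical names for this example, not the node's parameters, and this is not how the node itself is implemented.

```python
# Normalized sigmas for 24 steps, copied from the list earlier in the thread.
SIGMAS = [1.0, 0.8341, 0.6923, 0.5717, 0.4696, 0.3835, 0.3114, 0.2511,
          0.2012, 0.16, 0.1263, 0.0989, 0.0767, 0.0589, 0.0448, 0.0337,
          0.0251, 0.0184, 0.0133, 0.0095, 0.0066, 0.0045, 0.003, 0.002, 0.0]


def count_model_evaluations(sigmas, start_fraction=1.0, stop_fraction=0.0):
    """Count evaluations when guidance runs only while
    stop_fraction * sigmas[0] < sigma <= start_fraction * sigmas[0]."""
    top = sigmas[0]
    total = 0
    for sigma in sigmas[:-1]:  # the last sigma is the end point, not a step
        guided = stop_fraction * top < sigma <= start_fraction * top
        total += 2 if guided else 1  # guided steps also run the uncond pass
    return total


full = count_model_evaluations(SIGMAS)
delayed = count_model_evaluations(SIGMAS, start_fraction=0.5)
windowed = count_model_evaluations(SIGMAS, start_fraction=0.5, stop_fraction=0.05)
print(f"always on: {full}, delayed start: {delayed}, start and stop: {windowed}")
```

The `stop_fraction=0.05` value is arbitrary and only there to show the late deactivation stacking with the delayed start.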

[image: screenshot of the node settings]

Just a quick screenshot to make clear how to set up the node. Set "sigma_boost_percentage" to 0 if you want to try the paper's idea without the late deactivation.

Just adding that setting it at 50% of the sigmas as shown above, combined with the late deactivation, seems to give more varied results without penalizing the resulting quality too much, even with different schedulers such as exponential.

Very cool stuff! I'll try it out. Thanks! Yeah I was curious how the method would work for different schedulers.