About multi-resolution noise

Question

YangHai-1218 opened this issue 3 months ago · 1 comments

Hey, thanks for your excellent work.
Since Marigold and GeoWizard both use multi-resolution noise to achieve faster and more stable convergence, could you provide pseudocode describing how the multi-resolution noise is generated? I want to make sure I get it right in my implementation.
Thanks a lot!

Answer 1 · 2024-04-12T05:51:52.000Z

Here is the pseudocode I used in my implementation, just for your reference.

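import numpy as np
import torch
import torch.nn as nn

def pyramid_noise_like(x, timesteps, discount, scale, N):
    # scale and N were used but not defined in the original snippet, so they are passed in here:
    # N is the maximum number of pyramid levels, scale sets the range of the downscale factor.
    b, c, h_ori, w_ori = x.shape  # PyTorch layout: (batch, channels, height, width)
    u = nn.Upsample(size=(h_ori, w_ori), mode='bilinear')
    noise = torch.randn_like(x)
    for i in range(N):
        r = np.random.random()*scale + scale  # Random downscale factor in [scale, 2*scale)
        h, w = max(1, int(h_ori/(r**i))), max(1, int(w_ori/(r**i)))
        # Low-resolution noise, upsampled to full size, weighted by timestep (0..1000) and discount**i
        noise += u(torch.randn(b, c, h, w).to(x)) * (timesteps[...,None,None,None]/1000) * discount**i
        if h==1 or w==1: break # Lowest resolution is 1x1
    return noise/noise.std() # Scaled back to roughly unit variance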
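
To check the level schedule without running any tensor code, here is a minimal pure-Python sketch that lists the resolution and discount weight of each pyramid level. It is only an illustration, not part of the implementation above: the helper name `pyramid_levels`, the fixed downscale factor `r=2.0` (the snippet above samples `r` randomly at each level), and the example value `discount=0.9` are all assumptions.

```python
def pyramid_levels(h_ori, w_ori, discount, r=2.0, max_levels=10):
    """List (height, width, weight) for each pyramid level.

    Hypothetical helper: uses a fixed downscale factor r instead of
    the randomly sampled one, so the schedule is deterministic.
    """
    levels = []
    for i in range(max_levels):
        h = max(1, int(h_ori / (r ** i)))
        w = max(1, int(w_ori / (r ** i)))
        levels.append((h, w, discount ** i))
        if h == 1 or w == 1:
            break  # Lowest resolution reached, same stopping rule as above
    return levels


if __name__ == "__main__":
    # Example values are assumptions chosen for illustration only.
    for h, w, weight in pyramid_levels(64, 64, discount=0.9):
        print(f"{h}x{w} noise, weight {weight:.3f}")
```

With a 64x64 input and `r=2.0`, the loop halves the resolution at each level until it reaches 1x1, so lower-resolution noise is added with geometrically smaller weights when `discount < 1`.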