Correct implementation of this model?

Question

mjack3 opened this issue 3 years ago · 9 comments

Hello community.

I am working on implementing a paper that uses this framework. As I'm new to this tool, I'd like to ask more experienced users whether I am implementing this architecture correctly.

Firstly, I tried this code:

dims = (256, 64, 64)

inn = Ff.SequenceINN(*dims)
inn.append(Fm.AllInOneBlock, subnet_constructor=subnet_conv_3x3, permute_soft=False)

But I think the AllInOneBlock does not apply the permutation and the ActNorm in the order that the image shows. So I tried the following code:

dims = (256, 64, 64)

inn = Ff.SequenceINN(*dims)
inn.append(Fm.PermuteRandom)
inn.append(Fm.ActNorm)
inn.append(Fm.GLOWCouplingBlock, subnet_constructor=subnet_conv_3x3)

Which one is correct? Moreover, I am getting NaN vectors when I add Fm.ActNorm.

mjack3 commented 2 years ago

#121

Answer 1 · 2022-03-10T14:11:22.000Z

I think you can get rid of the NaNs by:

decreasing clamp_alpha (try 0.1, for example)
scaling the incoming features
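
To illustrate why a smaller clamp value can help, here is a minimal NumPy sketch of soft clamping in an affine coupling layer. It assumes the raw log-scale predicted by the subnet is squashed with tanh before exponentiation; the exact squashing function depends on the library version, and the names soft_clamp and scale_range are hypothetical helpers, not part of the framework.

```python
import numpy as np

def soft_clamp(s, clamp):
    """Squash a raw log-scale into [-clamp, clamp] (assumed tanh form)."""
    return clamp * np.tanh(s)

def scale_range(clamp, raw):
    """Return the smallest and largest multiplicative scale exp(s) can reach."""
    scale = np.exp(soft_clamp(raw, clamp))
    return float(scale.min()), float(scale.max())

# Raw subnet outputs, including extreme values that would overflow exp().
raw = np.array([-50.0, -1.0, 0.0, 1.0, 50.0])

for clamp in (1.0, 0.1):
    lo, hi = scale_range(clamp, raw)
    print(f"clamp={clamp}: scale in [{lo:.3f}, {hi:.3f}]")
```

With a clamp of 1.0 the scale can vary by a factor of about e in either direction per block, while 0.1 keeps it within roughly 10% of 1, so activations are less likely to blow up when several blocks are stacked.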

Edit: Remove the second GLOWCouplingBlock. One of these already depicts the full architecture shown in the image. If you add another one, you again need to add an ActNorm layer! I think this is the actual reason why you get NaNs.

Answer 2 · 2022-03-10T14:19:07.000Z

@maaft Thanks, I'll let you know.

I made a mistake by adding the second GLOWCouplingBlock. I have fixed the message.

Answer 3 · 2022-03-10T14:45:22.000Z

@maaft I am still getting NaN values this way:

inn = Ff.SequenceINN(*dims)
inn.append(Fm.PermuteRandom)
inn.append(Fm.ActNorm)
inn.append(Fm.GLOWCouplingBlock, subnet_constructor=subnet_conv_3x3, clamp=1.)

But the NaNs disappear if I put the ActNorm after the affine block:

inn = Ff.SequenceINN(*dims)
inn.append(Fm.PermuteRandom)
inn.append(Fm.GLOWCouplingBlock, subnet_constructor=subnet_conv_3x3, clamp=1.)
inn.append(Fm.ActNorm)

Actually, I don't know why this happens.

Answer 4 · 2022-06-07T01:09:11.000Z

Hi

Hello there, did you fix the issue? I ran into the same issue, but I do not know how to fix it XD

Answer 5 · 2022-06-22T14:43:56.000Z

@mjack3 thank you for pointing out this bug. The reason you get NaNs is that the mean and standard deviation of ActNorm are initialized from the first batch of data passed to them. Likely you have at least one channel in your input which always has the same value, leading to 0 standard deviation. Division by 0 is the source of the NaNs. We are working to fix this bug, but in the meantime you can work around it by adding some small noise to your inputs (this is recommended practice anyway, see tutorial).
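
The failure mode described above can be reproduced with a minimal NumPy sketch. This is a simplified stand-in for data-dependent initialization, not the library's actual ActNorm code: the helper names actnorm_init and actnorm_forward, the batch shape, and the noise level 1e-3 are all assumptions chosen for illustration.

```python
import numpy as np

def actnorm_init(batch):
    """Per-channel mean and std from the first batch, shape (N, C, H, W)."""
    mean = batch.mean(axis=(0, 2, 3), keepdims=True)
    std = batch.std(axis=(0, 2, 3), keepdims=True)
    return mean, std

def actnorm_forward(x, mean, std):
    """Normalize each channel; a zero std gives 0/0 = NaN."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return (x - mean) / std

rng = np.random.default_rng(0)
batch = rng.normal(size=(8, 3, 4, 4))
batch[:, 1] = 0.5  # channel 1 is constant, so its std is exactly 0

mean, std = actnorm_init(batch)
out = actnorm_forward(batch, mean, std)
has_nan = bool(np.isnan(out).any())

# Workaround: add small noise before the first (initializing) batch.
noisy = batch + 1e-3 * rng.normal(size=batch.shape)
mean_n, std_n = actnorm_init(noisy)
out_noisy = actnorm_forward(noisy, mean_n, std_n)
noisy_is_finite = bool(np.isfinite(out_noisy).all())

print("NaN without noise:", has_nan)
print("finite with noise:", noisy_is_finite)
```

The constant channel is what triggers the problem; the noise only needs to be large enough to give every channel a nonzero standard deviation in the initializing batch.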

The AllInOneBlock doesn't use data-dependent initialization and so doesn't suffer from this error.

You are correct that the AllInOneBlock does not apply the operations in the same order as the image you want to reproduce. The order is: 1) affine half-coupling (not the full coupling shown in your diagram), 2) permutation, 3) ActNorm.

Answer 6 · 2022-06-22T16:18:31.000Z

Oh, thank you @psorrenson. Why did you take so long to answer? I thought this repo was abandoned 😃

Answer 7 · 2022-06-22T16:33:42.000Z

Yes, sorry about that! The maintenance of the repo has been passed on from the original maintainers. But there was a long period where it was not being maintained. Hopefully we will be a bit faster at answering issues from now on ;)

Answer 8 · 2022-06-22T16:45:29.000Z

You will do a great job :)