Correct implementation of this model?
mjack3 opened this issue · 9 comments
Hello community.
I am working on implementing a paper that uses this framework. As I'm new to this tool, I'd like to ask more experienced users whether I am implementing this architecture correctly.
Firstly, I tried this code:
import FrEIA.framework as Ff
import FrEIA.modules as Fm

dims = (256, 64, 64)
inn = Ff.SequenceINN(*dims)
inn.append(Fm.AllInOneBlock, subnet_constructor=subnet_conv_3x3, permute_soft=False)
But I think the AllInOneBlock does not apply the permutation and the ActNorm in the order that the image shows. So I tried the following code:
dims = (256, 64, 64)
inn = Ff.SequenceINN(*dims)
inn.append(Fm.PermuteRandom)
inn.append(Fm.ActNorm)
inn.append(Fm.GLOWCouplingBlock, subnet_constructor=subnet_conv_3x3)
Which one is correct? Moreover, I am getting NaN values when I add the `Fm.ActNorm` layer.
I think you can get rid of NaN by:
- decreasing clamp_alpha (try 0.1 for example)
- scaling the incoming features
Edit: Remove the second GLOWCouplingBlock. One of these already depicts the full architecture shown in the image. If you add another one, you again need to add an ActNorm layer! I think this is the actual reason why you get NaNs.
@maaft Thanks, I'll let you know.
I made a mistake by including the second GLOWCouplingBlock. I fixed the message.
@maaft I am still getting NaN values this way:
inn = Ff.SequenceINN(*dims)
inn.append(Fm.PermuteRandom)
inn.append(Fm.ActNorm)
inn.append(Fm.GLOWCouplingBlock, subnet_constructor=subnet_conv_3x3, clamp=1.0)
But the NaNs disappear if I put the ActNorm after the affine block:
inn = Ff.SequenceINN(*dims)
inn.append(Fm.PermuteRandom)
inn.append(Fm.GLOWCouplingBlock, subnet_constructor=subnet_conv_3x3, clamp=1.0)
inn.append(Fm.ActNorm)
Actually, I don't know why this happens.
Hi
Hello there, did you fix the issue? I ran into the same issue but I do not know how to fix it XD
@mjack3 thank you for pointing out this bug. The reason you get NaNs is that the mean and standard deviation of ActNorm are initialized from the first batch of data passed to it. Likely you have at least one channel in your input which always has the same value, leading to a standard deviation of 0. Division by 0 is the source of the NaNs. We are working to fix this bug, but in the meantime you can work around it by adding some small noise to your inputs (this is recommended practice anyway, see tutorial).
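To make the failure mode concrete, here is a minimal sketch in plain NumPy. It is not FrEIA's ActNorm code; the helper names `init_stats` and `normalize` and the noise level `1e-3` are assumptions chosen for illustration. It computes per-channel statistics from a first batch, shows that a constant channel produces NaNs, and shows that a little input noise avoids them.

```python
import numpy as np

def init_stats(batch):
    """Per-channel mean and std of an (N, C, H, W) batch (hypothetical helper)."""
    mean = batch.mean(axis=(0, 2, 3), keepdims=True)
    std = batch.std(axis=(0, 2, 3), keepdims=True)
    return mean, std

def normalize(batch, mean, std):
    """Standardize each channel; a zero std gives 0 / 0 = NaN."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return (batch - mean) / std

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4, 16, 16))
x[:, 2] = 0.0  # one channel is constant across the whole batch

mean, std = init_stats(x)
has_nan_before = bool(np.isnan(normalize(x, mean, std)).any())

# Workaround: add small noise so that no channel has exactly zero variance.
x_noisy = x + 1e-3 * rng.normal(size=x.shape)
mean_n, std_n = init_stats(x_noisy)
has_nan_after = bool(np.isnan(normalize(x_noisy, mean_n, std_n)).any())

print(has_nan_before, has_nan_after)
```

The same check (per-channel std of your first batch) is a quick way to see whether your own features contain a constant channel.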
The AllInOneBlock doesn't use data-dependent initialization and so doesn't suffer from this error.
You are correct that the AllInOneBlock does not apply the operations in the same order as the image you want to reproduce. The order is 1) affine half-coupling (not the full coupling shown in your diagram), 2) permutation, 3) ActNorm.
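As a rough illustration of that ordering only, here is a toy NumPy sketch on flat feature vectors. It is not the AllInOneBlock implementation: `subnet`, `perm`, `scale` and `bias` are hypothetical stand-ins, and the real block works with learned parameters and clamping that are left out here.

```python
import numpy as np

rng = np.random.default_rng(0)
C = 4                                # number of channels (assumed even)
perm = rng.permutation(C)            # fixed channel permutation
scale = rng.uniform(0.5, 1.5, C)     # ActNorm-style per-channel scale
bias = rng.normal(size=C)            # ActNorm-style per-channel bias

def subnet(x1):
    """Hypothetical stand-in for the learned subnetwork: (log-scale, shift)."""
    return np.tanh(x1), 0.5 * x1

def forward(x):
    x1, x2 = x[:, :C // 2], x[:, C // 2:]
    s, t = subnet(x1)
    y = np.concatenate([x1, x2 * np.exp(s) + t], axis=1)  # 1) affine half-coupling
    y = y[:, perm]                                         # 2) permutation
    return y * scale + bias                                # 3) ActNorm-style affine

def inverse(z):
    y = (z - bias) / scale                                 # undo 3)
    y = y[:, np.argsort(perm)]                             # undo 2)
    y1, y2 = y[:, :C // 2], y[:, C // 2:]
    s, t = subnet(y1)                                      # y1 was passed through unchanged
    return np.concatenate([y1, (y2 - t) * np.exp(-s)], axis=1)  # undo 1)

x = rng.normal(size=(5, C))
roundtrip_ok = bool(np.allclose(inverse(forward(x)), x))
print(roundtrip_ok)
```

In the half-coupling only the second half of the channels is transformed per block; the permutation is what lets the other channels be transformed in later blocks.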
Oh, thank you @psorrenson. Why did it take so long to answer? I thought this repo was abandoned 😃
Yes, sorry about that! The maintenance of the repo has been passed on from the original maintainers. But there was a long period where it was not being maintained. Hopefully we will be a bit faster at answering issues from now on ;)
You will do a great job :)