Error in the implementation of ActNormalization
mjack3 opened this issue · 6 comments
In the latest pull request, the ActNorm layer was modified to add parameters per channel, height, and width. However, the original paper states:
"We propose an actnorm layer (for activation normalization), that performs an affine transformation of the activations using a scale and bias parameter per channel, similar to batch normalization."
This can also be verified in the official code. In short, the previous implementation seems correct to me.
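For reference, here is a minimal PyTorch sketch of a per-channel actnorm for image inputs. The class name, the data-dependent initialization, and the log-determinant computation are illustrative assumptions for this sketch and are not taken from this repository; the point is only that the scale and bias have shape (1, C, 1, 1) and broadcast over batch, height, and width.

import torch
import torch.nn as nn

class PerChannelActNorm(nn.Module):
    # Sketch of a Glow-style actnorm: one scale and one bias per channel.
    def __init__(self, num_channels):
        super().__init__()
        # Parameters broadcast over batch, height, and width: shape (1, C, 1, 1).
        self.log_scale = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.register_buffer("initialized", torch.tensor(False))

    def forward(self, x):
        if not self.initialized:
            # Data-dependent init: zero mean, unit variance per channel after the layer.
            with torch.no_grad():
                mean = x.mean(dim=(0, 2, 3), keepdim=True)
                std = x.std(dim=(0, 2, 3), keepdim=True)
                self.log_scale.copy_(-torch.log(std + 1e-6))
                self.bias.copy_(-mean / (std + 1e-6))
                self.initialized.fill_(True)
        y = x * torch.exp(self.log_scale) + self.bias
        # Log-determinant of the Jacobian: H * W * sum(log_scale), identical for every sample.
        log_det = x.shape[2] * x.shape[3] * self.log_scale.sum()
        return y, log_det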
The All-In-One Block does not rely on the ActNorm block that is used in Invertible ResNet. As far as I know, refactoring the All-In-One Block is not something we are considering right now, but it may become an objective in the future.
Sorry @LarsKue, although the AIO block does not use the ActNorm from Invertible ResNet, both are based on the same principle, so both might be implemented incorrectly (ActNorm is already fixed, thanks). If not, could you briefly explain the differences between ActNorm and the normalization block in AIO?
Regards
The incorrect implementation in ActNorm was a result of a recent rework. The AIO Block uses the following code to instantiate its global affine (actnorm) parameters with the correct shape:
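# Shape (1, C, 1, ..., 1): one scale and one offset per channel, broadcast over batch and all spatial dims.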
self.global_scale = nn.Parameter(torch.ones(1, self.in_channels, *([1] * self.input_rank)) * float(global_scale))
self.global_offset = nn.Parameter(torch.zeros(1, self.in_channels, *([1] * self.input_rank)))
Therefore, the issue fixed in #167 does not apply to the AIO block.
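As a quick sanity check, the following sketch mirrors those two lines outside the class; in_channels and input_rank are example values for 2D image inputs and are not taken from the repository:

import torch

in_channels, input_rank = 3, 2  # e.g. RGB images: one channel dim plus two spatial dims
global_scale = torch.ones(1, in_channels, *([1] * input_rank))
global_offset = torch.zeros(1, in_channels, *([1] * input_rank))
print(global_scale.shape)  # torch.Size([1, 3, 1, 1]) -> one parameter per channel
x = torch.randn(8, in_channels, 32, 32)
print((x * global_scale + global_offset).shape)  # torch.Size([8, 3, 32, 32]), broadcast over batch, H, W

So the AIO block's global affine parameters are already per channel only, matching the actnorm definition quoted above.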