What is a norm_layer?
dvornikita opened this issue · 4 comments
Hi, thank you for the nice implementation. I've got a question about how you use the EvoNorm S0 in the BasicBlock (your block definition):
```python
self.evo = EvoNorm2D(planes)
self.relu = nn.ReLU(inplace=True)
self.conv2 = conv3x3(planes, planes)
self.bn2 = norm_layer(planes)
```
What would `norm_layer` be in this case? My understanding of the original paper is that `norm_layer(planes)` actually has to be `EvoNorm2D(planes, non_linear=False)`. At least that is what I understood from Figure 5's caption and a footnote there (see the original paper).
What is your take on that?
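For concreteness, this is the variant I have in mind (just a sketch of my reading of Figure 5, assuming the `EvoNorm2D` constructor signature from this repo):

```python
# Hypothetical reading of Figure 5: every BN, even one without a
# following ReLU, becomes an EvoNorm2D with the nonlinearity disabled.
self.conv2 = conv3x3(planes, planes)
self.bn2 = EvoNorm2D(planes, non_linear=False)  # normalization only, no activation
```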
`self.evo = EvoNorm2D(planes)` is equivalent to `self.evo = EvoNorm2D(planes, non_linear=True)`. As per the paper, the normalization and the activation are applied together in the single `EvoNorm2D` class, although you have the option to switch the nonlinearity off by passing `non_linear=False`.
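For reference, here is a condensed sketch of what that flag controls, reconstructed from the paper's S0 formula; the names and details may differ from the actual code in this repo:

```python
import torch
import torch.nn as nn

def group_std(x, groups=32, eps=1e-5):
    # Standard deviation over (channels-in-group, H, W), per sample and group.
    N, C, H, W = x.shape
    x = x.reshape(N, groups, C // groups, H, W)
    var = x.var(dim=(2, 3, 4), keepdim=True)
    std = torch.sqrt(var + eps)
    return std.expand(x.shape).reshape(N, C, H, W)

class EvoNorm2D(nn.Module):
    """Sketch of EvoNorm-S0; `groups` must divide `channels`."""
    def __init__(self, channels, non_linear=True, groups=32, eps=1e-5):
        super().__init__()
        self.non_linear, self.groups, self.eps = non_linear, groups, eps
        # gamma/beta are the affine parameters, as in a conventional BatchNorm.
        self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))
        if non_linear:
            self.v = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x):
        if self.non_linear:
            # S0: x * sigmoid(v * x) / group_std(x), followed by the affine transform.
            num = x * torch.sigmoid(self.v * x)
            return num / group_std(x, self.groups, self.eps) * self.gamma + self.beta
        # non_linear=False: affine transform only.
        return x * self.gamma + self.beta
```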
My BasicBlock of ResNet doesn't replace all BN+ReLU layers with EvoNorm2D as per Figure 5 from the paper; I only replaced one BN+ReLU pair with EvoNorm2D. Also, the affine transformation that the paper talks about is basically the introduction of the gamma and beta parameters, which are also present by default in conventional BatchNorm layers.
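So the block is wired roughly like this (a sketch only, with stride and downsample handling omitted; the attribute names match the snippet quoted above):

```python
import torch.nn as nn

def conv3x3(in_planes, out_planes, stride=1):
    return nn.Conv2d(in_planes, out_planes, kernel_size=3,
                     stride=stride, padding=1, bias=False)

class BasicBlock(nn.Module):
    def __init__(self, inplanes, planes, norm_layer=nn.BatchNorm2d):
        super().__init__()
        self.conv1 = conv3x3(inplanes, planes)
        self.evo = EvoNorm2D(planes)        # replaces the first BN + ReLU pair
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = norm_layer(planes)       # second norm stays a plain BatchNorm
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x
        out = self.evo(self.conv1(x))       # fused normalization + activation
        out = self.bn2(self.conv2(out))
        out += identity                     # residual connection (shapes must match)
        return self.relu(out)
```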
I see. Thank you for the clarification. Do you have any results on the benchmark tasks to see how the accuracy of your implementation differs from the original one?
Unfortunately, I don't have the compute to benchmark my implementation on ImageNet and compare with the original paper's results. However, in my crude CIFAR-10 experiments, EvoNorm S0 did give a consistent 1-2% accuracy improvement.
Great. Thank you.