digantamisra98/EvoNorm

What is a norm_layer?

dvornikita opened this issue · 4 comments

Hi, thank you for the nice implementation. I've got a question about how you use the EvoNorm S0 in the BasicBlock (your block definition):

        self.evo = EvoNorm2D(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = norm_layer(planes)

What would be the norm_layer in this case? My understanding of the original paper is that norm_layer(planes) should actually be EvoNorm2D(planes, non_linear=False). At least that is what I understood from Figure 5's caption and a footnote there (see the original paper).

What is your take on that?
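For concreteness, a minimal sketch of the variant the question describes, assuming the repo's EvoNorm2D signature with its non_linear flag (the layer names mirror the BasicBlock snippet above):

        # Hypothetical variant: both normalization slots use EvoNorm2D,
        # with the second one reduced to a pure affine transform (no activation).
        self.evo = EvoNorm2D(planes)                    # normalization + non-linearity
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = EvoNorm2D(planes, non_linear=False)  # affine only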

self.evo = EvoNorm2D(planes) is equivalent to self.evo = EvoNorm2D(planes, non_linear=True). As per the paper, normalization and activation are applied together in the single EvoNorm2D class, although you have the option to switch the non-linearity off by passing non_linear=False. My ResNet BasicBlock didn't replace all BN + ReLU layers with EvoNorm2D as per Figure 5 of the paper; I only replaced one BN + ReLU pair with EvoNorm2D. Also, the affine transformation the paper talks about is simply the introduction of the gamma and beta parameters, which are also present by default in conventional BatchNorm layers.
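To make the non_linear switch concrete, here is a minimal, self-contained sketch of the EvoNorm-S0 computation as described in the paper. The class name EvoNorm2DS0 and the default group count of 32 are illustrative assumptions, not the repo's exact code:

    import torch
    import torch.nn as nn

    def group_std(x, groups=32, eps=1e-5):
        # Standard deviation over each group of channels (and the spatial dims).
        N, C, H, W = x.size()
        x = x.view(N, groups, C // groups, H, W)
        var = x.var(dim=(2, 3, 4), keepdim=True, unbiased=False)
        return torch.sqrt(var + eps).expand_as(x).reshape(N, C, H, W)

    class EvoNorm2DS0(nn.Module):
        # Illustrative sketch of EvoNorm-S0, not the repo's exact implementation.
        def __init__(self, channels, groups=32, non_linear=True):
            super().__init__()
            self.groups = groups
            self.non_linear = non_linear
            # gamma/beta are the affine parameters mentioned above,
            # analogous to the ones in a conventional BatchNorm layer.
            self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1))
            self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))
            if non_linear:
                self.v = nn.Parameter(torch.ones(1, channels, 1, 1))

        def forward(self, x):
            if self.non_linear:
                # EvoNorm-S0: x * sigmoid(v * x) / group_std(x), then affine.
                num = x * torch.sigmoid(self.v * x)
                return num / group_std(x, self.groups) * self.gamma + self.beta
            # non_linear=False: plain affine transform, no activation.
            return x * self.gamma + self.beta

With non_linear=False the layer degenerates to a per-channel affine transform, which is what the paper's footnote refers to.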

I see. Thank you for the clarification. Do you have any results on benchmark tasks showing how the accuracy of your implementation compares to the original?

Unfortunately I don't have the compute to benchmark my implementation on ImageNet and compare against the original paper's results. However, in my crude CIFAR-10 experiments, EvoNorm-S0 did give me a consistent +1-2% improvement.

Great. Thank you.