ReLU after residual branch missing?
f90 opened this issue · 1 comment
f90 commented
In the paper you can see the Fixup block diagram: the residual branch is applied, its output is added to the original input, and finally a ReLU is applied to this sum.
https://i.stack.imgur.com/T67F3.png
In the code I cannot see this final ReLU being applied anywhere (it should happen once per block). Is this an oversight, or am I missing something?
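For reference, here is a minimal sketch of the computation the diagram shows. The class and attribute names are illustrative, not taken from BoilerPlate, and the placement of the Fixup scalar biases/multiplier is simplified relative to the paper:

```python
import torch
import torch.nn as nn

class FixupBlockSketch(nn.Module):
    """Illustrative Fixup-style block: ReLU(branch(x) + x).
    Hypothetical names; simplified bias/scale placement."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.relu = nn.ReLU()
        self.bias1 = nn.Parameter(torch.zeros(1))  # Fixup scalar bias
        self.bias2 = nn.Parameter(torch.zeros(1))  # Fixup scalar bias
        self.scale = nn.Parameter(torch.ones(1))   # Fixup scalar multiplier

    def forward(self, x):
        out = self.relu(self.conv1(x + self.bias1))
        out = self.scale * self.conv2(out) + self.bias2
        # The step in question: the ReLU on top of the residual sum.
        return self.relu(out + x)

# e.g. FixupBlockSketch(16)(torch.randn(2, 16, 8, 8))
```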
f90 commented
More specifically, why is it

```python
return self.residual(x) + self.shortcut(x)
```

in https://github.com/ajbrock/BoilerPlate/blob/master/Models/fixup.py#L45, and not

```python
return self.activation(self.residual(x) + self.shortcut(x))
```
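In context, the proposed change would look something like this. This is a hypothetical sketch, assuming the block defines `self.residual`, `self.shortcut`, and an activation module as the names above suggest; it is not the actual repo code:

```python
import torch.nn as nn

class PatchedBlock(nn.Module):
    """Hypothetical block illustrating the proposed fix; attribute
    names mirror the snippet above, not the actual repo code."""
    def __init__(self, residual: nn.Module, shortcut: nn.Module):
        super().__init__()
        self.residual = residual    # Fixup residual branch
        self.shortcut = shortcut    # identity / downsampling path
        self.activation = nn.ReLU()

    def forward(self, x):
        # Apply the nonlinearity to the sum, matching the paper's diagram.
        return self.activation(self.residual(x) + self.shortcut(x))
```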