ajbrock/BoilerPlate

ReLU after residual branch missing?

f90 opened this issue · 1 comments

f90 commented

The Fixup diagram in the paper shows each block applying the residual branch, adding its output to the original input, and finally applying a ReLU to that sum.

https://i.stack.imgur.com/T67F3.png

In the code, I cannot see this final ReLU being applied anywhere (it should appear once per block). Is this an oversight, or am I missing something?

f90 commented

More specifically, why is it

return self.residual(x) + self.shortcut(x)

in https://github.com/ajbrock/BoilerPlate/blob/master/Models/fixup.py#L45

and not

return self.activation(self.residual(x) + self.shortcut(x))
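To make the difference concrete, here is a minimal pure-Python sketch of the two variants. It is not the code from fixup.py: `residual`, `shortcut`, and `relu` below are hypothetical stand-ins (a toy scale-and-shift branch and an identity shortcut) chosen only to show that the two return statements disagree whenever the sum has a negative entry.

```python
def relu(values):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def residual(x):
    # Hypothetical stand-in for the residual branch: a toy scale and shift.
    return [-2.0 * v + 0.5 for v in x]


def shortcut(x):
    # Assumed identity shortcut (input and output shapes match).
    return list(x)


def block_without_final_relu(x):
    # Mirrors `return self.residual(x) + self.shortcut(x)`.
    return [r + s for r, s in zip(residual(x), shortcut(x))]


def block_with_final_relu(x):
    # Mirrors `return self.activation(self.residual(x) + self.shortcut(x))`.
    return relu(block_without_final_relu(x))


x = [1.0, -1.0, 0.25]
print(block_without_final_relu(x))  # [-0.5, 1.5, 0.25]
print(block_with_final_relu(x))     # [0.0, 1.5, 0.25]
```

Without the final activation, the negative entry is passed on to the next block unchanged. With it, that entry is clamped to zero, which is what the diagram in the paper appears to show.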