Activation in `BasicConv2d`
Hi,

I've noticed that in the heavily utilized `BasicConv2d` block you create a `ReLU` layer during block initialization, yet never use it in the actual forward pass. Since this is the most basic block of your network, this significantly reduces the number of activations in the model.
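For reference, the pattern in question looks roughly like this (a paraphrased sketch of a conv + batch-norm block, not a verbatim copy of the repository code):

```python
import torch.nn as nn

class BasicConv2d(nn.Module):
    """Conv + BatchNorm block: a ReLU is created in __init__ but never applied."""
    def __init__(self, in_planes, out_planes, kernel_size, stride=1, padding=0, dilation=1):
        super().__init__()
        self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size,
                              stride=stride, padding=padding, dilation=dilation,
                              bias=False)
        self.bn = nn.BatchNorm2d(out_planes)
        self.relu = nn.ReLU(inplace=True)  # defined here ...

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        return x                           # ... but never called here
```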
In some other places, for example in the reverse attention branches of PraNet, you call `F.relu` manually; yet multiple other places (like the entirety of the aggregation module) are activation-free.

Is that a bug or a feature? If the latter, did you research the impact of those linear layers compared to the usual `ReLU`?
Sergey
Hi, @SergeyTsimfer!
First, thank you for your attention to our work.
When conducting the ablation studies, we defined the `ReLU` non-linear function in the `BasicConv2d` module. However, we found that it only slightly impacts the overall performance of our PraNet, so we didn't discuss it in our original paper. In other words, it is an empirical choice.
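A minimal sketch of such a variant, with the `ReLU` that is already defined in `__init__` actually applied at the end of the forward pass (illustrative only; the class name `BasicConv2dReLU` is hypothetical and not from the repository):

```python
import torch
import torch.nn as nn

class BasicConv2dReLU(nn.Module):
    """Variant of BasicConv2d that applies the ReLU created in __init__."""
    def __init__(self, in_planes, out_planes, kernel_size, stride=1, padding=0, dilation=1):
        super().__init__()
        self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size,
                              stride=stride, padding=padding, dilation=dilation,
                              bias=False)
        self.bn = nn.BatchNorm2d(out_planes)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Same as BasicConv2d, but with the activation actually applied.
        return self.relu(self.bn(self.conv(x)))

# Quick shape check
block = BasicConv2dReLU(32, 64, kernel_size=3, padding=1)
print(block(torch.randn(1, 32, 44, 44)).shape)  # torch.Size([1, 64, 44, 44])
```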
You can try more variants based on our design and tell me the results you obtain via my e-mail (gepengai.ji@gmail.com).
Best regards,
Ge-Peng.