ChunML/ssd-tf2

There is only one BN in network, is that useful?

Closed this issue · 1 comment

In my opinion, there is no mention of batch normalization in the paper, yet only the first feature layer applies this operation in the code. Is it useful? Could we apply it to every feature layer before compute_heads?

conf, loc = self.compute_heads(self.batch_norm(x), head_idx)

In the paper, the authors stated:

Since, as pointed out in [12], conv4_3 has a different feature scale compared to the other layers, we use the L2 normalization technique introduced in [12] to scale the feature norm at each location in the feature map to 20 and learn the scale during back propagation.

Because there wasn't an equivalent implementation for TensorFlow at the time, I simply used a BatchNorm layer instead. I tried to implement the L2 normalization layer on my own, but the loss didn't converge as expected.
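For reference, the ParseNet-style layer the paper describes can be written as a custom Keras layer: L2-normalize each spatial location across the channel axis, then multiply by a per-channel scale initialized to 20 and learned during backprop. This is only a sketch of the technique, not code from this repository, and the class name `L2Normalization` is my own:

```python
import tensorflow as tf


class L2Normalization(tf.keras.layers.Layer):
    """L2-normalizes features across channels at each spatial location,
    then rescales with a learnable per-channel scale (init 20), as in
    ParseNet [12]. Sketch only; not the repo's implementation."""

    def __init__(self, init_scale=20.0, **kwargs):
        super().__init__(**kwargs)
        self.init_scale = init_scale

    def build(self, input_shape):
        channels = input_shape[-1]
        # One learnable scale per channel, initialized to 20.
        self.gamma = self.add_weight(
            name="gamma",
            shape=(channels,),
            initializer=tf.keras.initializers.Constant(self.init_scale),
            trainable=True,
        )

    def call(self, x):
        # Unit-normalize over the channel axis at every (h, w) location,
        # then apply the learned scale.
        x = tf.math.l2_normalize(x, axis=-1, epsilon=1e-12)
        return x * self.gamma
```

It would be applied only to the conv4_3 feature map, e.g. `x = L2Normalization()(conv4_3_out)`, in place of the BatchNorm call shown above.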