aharley/simple_bev

Questions on architecture design choices

Closed this issue · 2 comments

Kait0 commented

Hi, I have a couple of questions regarding your NN architecture design. Could you give the motivation for these particular design choices (or, if they are copied from some other work, point me to it)?

For both ResNet backbones, you stopped at the 3rd block and did not use the 4th block:

self.backbone = nn.Sequential(*list(resnet.children())[:-4])

self.layer3 = backbone.layer3

What is the motivation for the use of instance normalization in the decoders?

nn.InstanceNorm2d(shared_out_channels),

Why did you not use activation functions for the up-sampling layers in the BEV grid?

class UpsamplingAdd(nn.Module):

Great questions.

3rd block: I think we did this so that the resolutions would match up with the EfficientNet versions (which came from FIERY).

Instance norm: This probably makes only a tiny difference in practice. In general I prefer InstanceNorm over BatchNorm, because InstanceNorm computes its statistics per sample, which makes it friendlier to experiments with low batch sizes.
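The batch-size point can be demonstrated directly: InstanceNorm2d normalizes each (sample, channel) plane on its own, so a sample's output is independent of its batch-mates, while BatchNorm's training-time statistics mix the whole batch. A small sketch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 8, 16, 16)  # a batch of 4 feature maps

inorm = nn.InstanceNorm2d(8)
bnorm = nn.BatchNorm2d(8).train()

# InstanceNorm: sample 0 is normalized the same way alone or inside a batch.
alone = inorm(x[:1])
in_batch = inorm(x)[:1]
print(torch.allclose(alone, in_batch, atol=1e-5))  # True

# BatchNorm (training mode): batch statistics change with the batch contents,
# so sample 0's output depends on the other samples.
alone_bn = bnorm(x[:1])
in_batch_bn = bnorm(x)[:1]
print(torch.allclose(alone_bn, in_batch_bn, atol=1e-5))  # False
```

This is why InstanceNorm behaves consistently even at batch size 1, where BatchNorm's batch statistics become unreliable.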

No activation in upsampling: copied from FIERY: https://github.com/wayveai/fiery/blob/master/fiery/layers/convolutions.py
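For readers who don't want to follow the link, here is a sketch in the spirit of FIERY's UpsamplingAdd: bilinear upsample, then conv + norm with no trailing activation, then an additive skip connection. The channel counts and scale factor below are illustrative, not taken from either repo:

```python
import torch
import torch.nn as nn

class UpsamplingAdd(nn.Module):
    """Sketch of a FIERY-style upsample-and-add block (illustrative values).
    Note there is no activation after the norm: the sum with the skip
    stays un-squashed, like a plain residual addition."""
    def __init__(self, in_channels, out_channels, scale_factor=2):
        super().__init__()
        self.upsample_layer = nn.Sequential(
            nn.Upsample(scale_factor=scale_factor, mode='bilinear', align_corners=False),
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False),
            nn.InstanceNorm2d(out_channels),
        )

    def forward(self, x, x_skip):
        x = self.upsample_layer(x)
        return x + x_skip  # additive skip, no nonlinearity

up = UpsamplingAdd(64, 32)
deep = torch.randn(2, 64, 25, 25)   # coarse BEV features
skip = torch.randn(2, 32, 50, 50)   # finer-resolution skip features
out = up(deep, skip)
print(out.shape)  # torch.Size([2, 32, 50, 50])
```

Leaving the activation out keeps the block closer to an identity/residual mapping; any nonlinearity is applied by whatever consumes the summed features.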

Kait0 commented

Thanks.