rosinality/sagan-pytorch

Why so many ConvBlock(512, 512)?

Opened this issue · 2 comments

    self.conv = nn.ModuleList([ConvBlock(512, 512, n_class=n_class),
                               ConvBlock(512, 512, n_class=n_class),
                               ConvBlock(512, 512, n_class=n_class,
                                         self_attention=True),
                               ConvBlock(512, 256, n_class=n_class),
                               ConvBlock(256, 128, n_class=n_class)])

A deeper and wider network gave better results. Since each ConvBlock contains only one conv module, the network is not very deep.

It would be even better to use more ConvBlocks with fewer filters. Deeper networks tend to learn better features. I would say the current model architecture is too wide, which leads to many "dead" kernels.
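To make the "deeper but narrower" trade-off concrete, here is a rough parameter-count comparison. It counts only the 3x3 conv weights and biases in each block (ignoring norm and attention layers), and the narrower channel plan is purely hypothetical, not something from the repo:

```python
# Parameters in one 3x3 conv layer (weights + bias); norm layers ignored.
def conv3x3_params(in_ch, out_ch, k=3):
    return in_ch * out_ch * k * k + out_ch

# Current stack: (in, out) channel pairs from the ModuleList above.
wide = [(512, 512), (512, 512), (512, 512), (512, 256), (256, 128)]

# Hypothetical deeper-but-narrower alternative: more blocks, half the width.
deep = [(256, 256)] * 6 + [(256, 128), (128, 128)]

wide_total = sum(conv3x3_params(i, o) for i, o in wide)
deep_total = sum(conv3x3_params(i, o) for i, o in deep)

print(len(deep), len(wide))     # 8 blocks vs 5
print(deep_total < wide_total)  # True: deeper stack, fewer parameters
```

Because conv parameter count scales with in_channels * out_channels, halving the width roughly quarters the cost per block, so the deeper stack here still ends up with under half the parameters of the current wide one.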