szq0214/DSOD

Transition w/o Pooling Layer size mismatch

qianyizhang opened this issue · 1 comment

It seems your model graph is inconsistent with the paper (Table 1, Output Size) for the Transition w/o Pooling Layers (1) and (2).
In the paper:
Transition w/o Pooling Layer (1) channel = 1120
Transition w/o Pooling Layer (2) channel = 1568
In the model graph:
Convolution49 num output = 1184
Convolution66 num output = 256

Also, I don't quite understand the purpose of Transition w/o Pooling Layer (1): you don't actually compress or expand its filter number (num input = num output), and you don't branch it out for prediction. By removing it (Convolution49 + BN50 + ReLU50) you would get a compact Dense Block (3+4) with 8 x 2 = 16 dense layers. So what is the reason for explicitly injecting such an extra (BN+ReLU+1x1Conv) block in between?

Hi @qianyizhang, you are right. 1120 is a minor mistake and it should be 1184. We will correct it in a new version. In our original definition, the transition w/o pooling layer has the same input and output channel size, so it is hard to state '256' in the table: we want to downsample the channels while avoiding two consecutive 1x1 conv layers. We will clarify this. Regarding the purpose of transition w/o pooling layer (1), we ran an ablation experiment in which we removed it and used more dense layers in a single dense block, but the performance degraded. We conjecture that this layer helps combine newly learned features with old features and is an important component of our structure. Maybe you can try it yourself. BTW, thanks very much for pointing these out to help us improve the paper. :)
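
For anyone reading along, here is a minimal PyTorch-style sketch (not the authors' Caffe prototxt) of what a transition w/o pooling block looks like under the definition above: BN + ReLU + 1x1 conv where the output channel count equals the input channel count, so the block recombines the concatenated dense-block features without changing the tensor shape. The 1184-channel figure comes from the reply above; the spatial size in the example is arbitrary.

```python
import torch
import torch.nn as nn

class TransitionWithoutPooling(nn.Module):
    """Sketch of a transition w/o pooling block: BN + ReLU + 1x1 conv,
    with num_output == num_input (no compression, no expansion)."""
    def __init__(self, channels):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 conv keeps the channel count, acting as a learned
        # recombination of new and old features from the dense block.
        self.conv = nn.Conv2d(channels, channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.conv(self.relu(self.bn(x)))

# 1184 channels as in Convolution49; spatial size chosen only for illustration.
x = torch.randn(1, 1184, 19, 19)
y = TransitionWithoutPooling(1184)(x)
assert y.shape == x.shape  # shape is preserved by design
```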