yahoo/open_nsfw

Msra vs Xavier

ryanjay0 opened this issue · 1 comment

I've noticed the only difference between the default resnet50_1by2 and your implementation (besides the number of classes) is the change of the weight_filler from msra to xavier, and of the bias_filler from constant to xavier, in the InnerProduct layer.

Was there a reason for that change? Maybe the small number of classes? Did it make a big difference?
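For reference, the layer in question would look roughly like this in Caffe prototxt. This is a sketch, not the exact file: `num_output: 2` matches open_nsfw's two-class output, but the `bottom`/`top` names here are placeholders:

```protobuf
layer {
  name: "fc_nsfw"
  type: "InnerProduct"
  bottom: "pool"                       # placeholder bottom blob name
  top: "fc_nsfw"
  inner_product_param {
    num_output: 2                      # two classes: nsfw / sfw
    weight_filler { type: "xavier" }   # reference ResNet uses type: "msra"
    bias_filler { type: "xavier" }     # reference ResNet uses type: "constant"
  }
}
```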

I am assuming the default resnet50_1by2 is the one mentioned here. Initialization makes little difference when fine-tuning, since only the parameters of the last layer (FC_nsfw) are freshly initialized; the rest are loaded from the pretrained model. The effect of initialization is more significant when training from scratch on ImageNet; you can refer to the corresponding papers (Glorot & Bengio for Xavier, He et al. for MSRA) for more details.
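As a small numerical illustration of why the filler choice barely matters for this particular layer (a sketch using the standard normal-distribution variants of the Glorot/Xavier and He/MSRA formulas and a hypothetical 1024-dim input, not open_nsfw's actual dimensions; Caffe's fillers default to uniform variants): with only two output classes, the two standard deviations nearly coincide.

```python
import numpy as np

# Hypothetical fan sizes: a 1024-dim feature vector feeding a 2-class FC layer.
fan_in, fan_out = 1024, 2

# Glorot/Xavier (normal variant): std = sqrt(2 / (fan_in + fan_out))
xavier_std = np.sqrt(2.0 / (fan_in + fan_out))
# He/MSRA (normal variant): std = sqrt(2 / fan_in)
msra_std = np.sqrt(2.0 / fan_in)

# Sample a weight matrix the way a filler would at layer creation.
rng = np.random.default_rng(0)
w = rng.normal(0.0, xavier_std, size=(fan_out, fan_in))

print(f"xavier std: {xavier_std:.5f}")
print(f"msra   std: {msra_std:.5f}")
# With fan_out much smaller than fan_in, the two are nearly identical,
# and during fine-tuning this is the only randomly initialized layer anyway.
```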