Correct way to use BatchNorm layer

Question

happynear opened this issue 8 years ago · 1 comments

Thanks for your scripts! I found a small mistake in your network definition. In the testing phase, we should use the moving-average mean and variance instead of the mini-batch statistics. In fact, the default setting of the official implementation already works the correct way.

A correct BatchNorm layer (and the Scale layer after it) should be:

```
layer {
  name: "first_conv_bn"
  type: "BatchNorm"
  bottom: "first_conv"
  top: "first_conv"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
}
layer {
  name: "first_conv_scale"
  type: "Scale"
  bottom: "first_conv"
  top: "first_conv"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 1
    decay_mult: 0
  }
  scale_param {
    bias_term: true
  }
}
```
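
For reference, here is a sketch of what "the default setting" means, assuming the stock Caffe BatchNorm layer. When `use_global_stats` is left unset in `batch_norm_param`, the layer picks it from the phase: mini-batch statistics in TRAIN, stored moving averages in TEST. The fragment below only spells out the test-time choice explicitly, for example in a deploy-only net. It is an illustration and is not taken from the repository under discussion.

```
layer {
  name: "first_conv_bn"
  type: "BatchNorm"
  bottom: "first_conv"
  top: "first_conv"
  # The three blobs (mean, variance, moving-average factor) are not
  # learned by the solver, so their learning rates stay at 0.
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  # Assumption for illustration: this net is used for inference only.
  # true  -> normalize with the stored moving-average mean/variance
  # false -> normalize with the current mini-batch statistics
  batch_norm_param {
    use_global_stats: true
  }
}
```

In a shared train/test definition, leaving `batch_norm_param` out entirely should give the same test-time behavior, since the layer then follows the phase. Hard-coding `use_global_stats: false` for the test phase would be the kind of mistake described above.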

Answer 1 · 2017-01-17T04:15:58.000Z

@happynear After reading the Caffe source, I think you are right. I've fixed it. Just comment out this