szagoruyko/attention-transfer

Why not use bn for teacher net in imagenet.py

Opened this issue · 2 comments

First of all, thanks for your great work!

I wonder why you do not use BN layers when running inference with the teacher model here ( https://github.com/szagoruyko/attention-transfer/blob/master/imagenet.py#L117 ). Is it a typo?

Hope for your reply!

I have the same question 0.0

Hi, I think I have figured out why there are no BN layers in the teacher structure. The pretrained models page says:

"Folded Models below have batch_norm parameters and statistics folded into convolutional layers for speed. It is not recommended to use them for finetuning."

url here
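For reference, "folding" BN into a conv layer means absorbing the BN scale and shift into the conv weights and bias, so inference needs only the convolution. A minimal sketch of the arithmetic (the function name and shapes are illustrative, not taken from the repo):

```python
import numpy as np

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters into the preceding conv layer.

    w: conv weights, shape (out_ch, in_ch, kh, kw)
    b: conv bias, shape (out_ch,)
    gamma, beta, mean, var: BN scale, shift, and running statistics,
    each of shape (out_ch,).
    """
    scale = gamma / np.sqrt(var + eps)           # per-channel BN factor
    w_folded = w * scale[:, None, None, None]    # scale each output filter
    b_folded = (b - mean) * scale + beta         # absorb mean/shift into bias
    return w_folded, b_folded
```

After folding, `conv(x, w_folded, b_folded)` produces the same output as `bn(conv(x, w, b))` in eval mode, which is why the released teacher checkpoints have no separate BN layers, and why fine-tuning them is discouraged: the BN statistics are frozen into the weights.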