ShengcaiLiao/QAConv

self.model.eval()

xiaopanchen opened this issue · 2 comments

Recently, I have been reading your QAConv code, and I have a question. In the train() method in trainer.py, I see the following code:

    class BaseTrainer(object):
        ...
        def train(self, ...):
            self.model.eval()
            self.criterion.train()
            ...
            for i, inputs in enumerate(data_loader):
                ...

Why don't you set the model to train mode with self.model.train() instead of calling model.eval()? Also, I could not find any other place in the project where model.train() is called.

Hi, thanks for the interest. A good question; I'm glad you noticed this. This is newly included. Generally, model.eval() only affects the behavior of dropout layers (which we do not use) and the running statistics of BN layers. I use BN layers pretrained on ImageNet, and recently I found that keeping their pretrained running statistics helps the generalization of the learned model, with about a 1%-2% improvement in cross-dataset evaluation. This is probably because there is a BN statistics drift problem when re-id models are trained on datasets of limited scale, whereas ImageNet is large and general. Therefore, I decided to freeze the BN layers for better generalizability.
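To illustrate the pattern, here is a minimal sketch (not the actual QAConv trainer; the ResNet-50 backbone, cross-entropy loss, and optimizer settings are stand-ins): calling model.eval() keeps the pretrained BN running statistics fixed during training, while gradients still flow through the network and the criterion keeps its training-time behavior.

```python
import torch
import torchvision

# Minimal sketch of the idea, assuming an ImageNet-pretrained backbone.
model = torchvision.models.resnet50(pretrained=True)   # assumed backbone
criterion = torch.nn.CrossEntropyLoss()                 # placeholder, not the QAConv loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

model.eval()        # freeze BN running mean/var at the pretrained values
criterion.train()   # the loss module keeps its training-time behavior

images = torch.randn(8, 3, 224, 224)    # dummy batch for illustration
targets = torch.randint(0, 1000, (8,))

logits = model(images)                  # forward pass with frozen BN statistics
loss = criterion(logits, targets)
optimizer.zero_grad()
loss.backward()                         # gradients still flow through BN layers
optimizer.step()
```

Note that eval() only stops the updates of the BN running statistics; the BN affine parameters (weight and bias) still receive gradients and are updated by the optimizer unless they are explicitly excluded.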

Thank you very much. I see.