Eromera/erfnet_pytorch

encoder and decoder separate during training?

hydxqing opened this issue · 4 comments

Hello, I have a question.
Are the encoder and decoder trained separately? Is it not end-to-end training?

Hi, to improve gradient propagation through the encoder we train them separately: first the encoder alone, then we attach the decoder and continue training. You can also train end-to-end from scratch, but you will get a slightly lower result. If you use significantly more epochs (e.g. 300), the result should be similar to separate training, since the encoder's weights should receive enough updates.
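The two-stage scheme described above can be sketched in a few lines of PyTorch. This is a minimal, illustrative sketch, not the actual ERFNet training code: the tiny encoder/decoder modules, the auxiliary 1x1 classifier head, and the downsampled labels are all assumptions standing in for the real pipeline.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the real encoder/decoder (shapes are illustrative).
encoder = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU())
decoder = nn.Sequential(
    nn.ConvTranspose2d(8, 2, 3, stride=2, padding=1, output_padding=1))

x = torch.randn(4, 3, 16, 16)           # dummy images
y = torch.randint(0, 2, (4, 16, 16))    # dummy segmentation labels

# Stage 1: train the encoder alone, supervising it at its own (downsampled)
# output resolution through a small auxiliary classifier head.
aux_head = nn.Conv2d(8, 2, 1)
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(aux_head.parameters()), lr=1e-3)
y_small = y[:, ::2, ::2]  # labels downsampled to the encoder's resolution
for _ in range(5):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(aux_head(encoder(x)), y_small)
    loss.backward()
    opt.step()

# Stage 2: discard the auxiliary head, attach the decoder, and continue
# training the full network end to end at full resolution.
full = nn.Sequential(encoder, decoder)
opt = torch.optim.Adam(full.parameters(), lr=1e-3)
for _ in range(5):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(full(x), y)
    loss.backward()
    opt.step()
```

Because stage 1 trains the encoder with a much shorter gradient path, its weights are already in a reasonable state when the decoder is attached, which is the gradient-propagation benefit mentioned above.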

@Eromera How can I train it from scratch without the pretrained ImageNet model? Just use the well-trained model_encoder_best.pth?

@MrLinNing Yes, you can use the trained model_encoder_best.pth by training in a decoder stage on top of it (using flags --decoder and --pretrainedEncoder "model_encoder_best.pth"), or you can train the full network end to end without pretraining the encoder (just use the --decoder flag). If you take the second option, train for significantly more epochs (~300) so that the encoder receives more weight updates.
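For concreteness, the two options map to invocations like the following. This is a hedged sketch: the script name `main.py` and the other arguments (`--datadir`, `--num-epochs`) are assumptions for illustration; only the `--decoder` and `--pretrainedEncoder` flags come from the reply above.

```shell
# Option 1: decoder stage on top of a pretrained encoder
python main.py --decoder --pretrainedEncoder "model_encoder_best.pth" --datadir /path/to/cityscapes

# Option 2: full network from scratch; use many more epochs (~300)
python main.py --decoder --datadir /path/to/cityscapes --num-epochs 300
```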

Hi @Eromera, thanks for your great work.
There is one thing I cannot understand well: when you train without pretraining the encoder, how do you initialize the parameters of the layers? I noticed there is a function "weight_init" defined; did you use it when training from scratch?
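For readers wondering what such a helper usually does: below is a sketch of a typical PyTorch `weight_init` function using He/Kaiming initialization. This is a common idiom for ReLU-based segmentation networks, not necessarily the exact function defined in erfnet_pytorch.

```python
import torch
import torch.nn as nn

def weight_init(m):
    """Typical from-scratch initialization (illustrative, not the repo's exact code)."""
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        # He/Kaiming initialization is suited to ReLU activations
        nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.BatchNorm2d):
        # BatchNorm starts as the identity transform
        nn.init.ones_(m.weight)
        nn.init.zeros_(m.bias)

net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
net.apply(weight_init)  # applies weight_init recursively to every submodule
```

`Module.apply` walks the whole module tree, so calling it once on the network initializes every conv and batch-norm layer before the first training step.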