Eromera/erfnet_pytorch

encoder and decoder separate during training?

hydxqing opened this issue · 4 comments

Hello, I have a question.
Are the encoder and decoder trained separately? Is it not end-to-end training?

Hi, to improve gradient propagation through the encoder we train them separately: first the encoder alone, then we attach the decoder and continue training. You can also train end-to-end from scratch, but you will get a slightly lower result. If you use significantly more epochs (e.g. 300), the result should be similar to separate training, since the encoder's weights should receive enough updates.
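The two-stage scheme described above can be sketched in a few lines of PyTorch. This is a minimal, illustrative sketch, not the actual ERFNet training code: the tiny encoder/decoder modules, the auxiliary 1x1 classifier head, and the downsampled labels are all assumptions standing in for the real pipeline.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the real encoder/decoder (shapes are illustrative).
encoder = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU())
decoder = nn.Sequential(
    nn.ConvTranspose2d(8, 2, 3, stride=2, padding=1, output_padding=1))

x = torch.randn(4, 3, 16, 16)           # dummy images
y = torch.randint(0, 2, (4, 16, 16))    # dummy segmentation labels

# Stage 1: train the encoder alone, supervising it at its own (downsampled)
# output resolution through a small auxiliary classifier head.
aux_head = nn.Conv2d(8, 2, 1)
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(aux_head.parameters()), lr=1e-3)
y_small = y[:, ::2, ::2]  # labels downsampled to the encoder's resolution
for _ in range(5):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(aux_head(encoder(x)), y_small)
    loss.backward()
    opt.step()

# Stage 2: discard the auxiliary head, attach the decoder, and continue
# training the full network end to end at full resolution.
full = nn.Sequential(encoder, decoder)
opt = torch.optim.Adam(full.parameters(), lr=1e-3)
for _ in range(5):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(full(x), y)
    loss.backward()
    opt.step()
```

Because stage 1 trains the encoder with a much shorter gradient path, its weights are already in a reasonable state when the decoder is attached, which is the gradient-propagation benefit mentioned above.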

@Eromera How can I train it from scratch without the pretrained ImageNet model? Just use the well-trained model_encoder_best.pth?

@MrLinNing Yes, you can use the trained model_encoder_best.pth by training in a decoder stage on top of it (using flags --decoder and --pretrainedEncoder "model_encoder_best.pth"), or you can train the full network end to end without pretraining the encoder (just use the --decoder flag). If you take the second option, train for significantly more epochs (~300) so that the encoder receives more weight updates.
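For concreteness, the two options map to invocations like the following. This is a hedged sketch: the script name `main.py` and the other arguments (`--datadir`, `--num-epochs`) are assumptions for illustration; only the `--decoder` and `--pretrainedEncoder` flags come from the reply above.

```shell
# Option 1: decoder stage on top of a pretrained encoder
python main.py --decoder --pretrainedEncoder "model_encoder_best.pth" --datadir /path/to/cityscapes

# Option 2: full network from scratch; use many more epochs (~300)
python main.py --decoder --datadir /path/to/cityscapes --num-epochs 300
```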

Hi @Eromera, thanks for your great work.
There is one thing I cannot understand well: when you train without pretraining the encoder, how do you initialize the parameters of the layers? I noticed there is a function "weight_init" defined; did you use it when training from scratch?
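For readers wondering what such a helper usually does: below is a sketch of a typical PyTorch `weight_init` function using He/Kaiming initialization. This is a common idiom for ReLU-based segmentation networks, not necessarily the exact function defined in erfnet_pytorch.

```python
import torch
import torch.nn as nn

def weight_init(m):
    """Typical from-scratch initialization (illustrative, not the repo's exact code)."""
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        # He/Kaiming initialization is suited to ReLU activations
        nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.BatchNorm2d):
        # BatchNorm starts as the identity transform
        nn.init.ones_(m.weight)
        nn.init.zeros_(m.bias)

net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
net.apply(weight_init)  # applies weight_init recursively to every submodule
```

`Module.apply` walks the whole module tree, so calling it once on the network initializes every conv and batch-norm layer before the first training step.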