facebookresearch/CodeGen

Question about model training

runningmq opened this issue · 1 comment

Hi,
I'm trying to train the models from scratch.
One question about Unsupervised Translation of Programming Languages:

As I understand it, there are 3 steps:
step 1: train the XLM model on all Python/Java/C++ code
step 2: initialize the encoder and decoder parameters from step 1 and train the unsupervised DAE task on code functions
step 3: back-translation (see the toy sketch after this list)
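
To check my understanding of step 3, here is a toy, self-contained sketch of one back-translation iteration. The `ToyTranslator` class and its `translate`/`train_step` methods are invented for illustration and are not this repo's API; the real system decodes and trains a shared seq2seq model on tokenized functions.

```python
import random

class ToyTranslator:
    """Stand-in for the shared encoder-decoder model (invented for this sketch)."""

    def translate(self, src, src_lang, tgt_lang):
        # Pretend translation: the real system runs greedy/beam decoding here.
        return f"// {tgt_lang} version of: {src}"

    def train_step(self, src, tgt, src_lang, tgt_lang):
        # Pretend supervised update on the synthetic pair; returns a fake loss.
        return random.random()

def bt_iteration(model, python_fn):
    # 1) Use the current model to produce a (possibly noisy) Java translation.
    java_fn = model.translate(python_fn, src_lang="python", tgt_lang="java")
    # 2) Train to reconstruct the original Python from that Java, treating
    #    (java_fn, python_fn) as an on-the-fly synthetic parallel pair.
    return model.train_step(src=java_fn, tgt=python_fn,
                            src_lang="java", tgt_lang="python")

print(bt_iteration(ToyTranslator(), "def f(x): return x + 1"))
```

The key idea being that the model's own (noisy) translations act as synthetic parallel data, so the pairs improve as the model improves.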

So are the final models produced by running the 3 steps above in sequence, or are steps 2 and 3 trained at the same time?

Thanks

brozi commented

We do steps 2 and 3 at the same time. We just use a schedule for the learning rate of the DAE steps, decreasing it smoothly over time to give more importance to back-translation.
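
For intuition, here is a minimal sketch of that kind of smooth decay, implemented as a piecewise-linear coefficient on the DAE objective. The checkpoints and values below are illustrative, not the repo's defaults (if memory serves, XLM-style configs express this with strings like `0:1,30000:0.1,100000:0` passed via `--lambda_ae`, but treat that as an assumption):

```python
def dae_coefficient(step, schedule=((0, 1.0), (30_000, 0.1), (100_000, 0.0))):
    """Piecewise-linear decay of the DAE weight over training steps."""
    for (s0, v0), (s1, v1) in zip(schedule, schedule[1:]):
        if s0 <= step < s1:
            t = (step - s0) / (s1 - s0)
            return v0 + t * (v1 - v0)
    return schedule[-1][1]  # past the last checkpoint: keep the final value

# Each joint update then weighs the two objectives roughly as:
#   loss = dae_coefficient(step) * dae_loss + bt_loss
for step in (0, 15_000, 30_000, 65_000, 100_000):
    print(step, round(dae_coefficient(step), 3))
```

Early in training the model mostly learns to denoise, which keeps its generations well-formed; later, back-translation dominates and drives the actual translation quality.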