learning rate for pretraining
LeeDoYup opened this issue · 0 comments
LeeDoYup commented
Hello, thanks for the great project.
I would like to know the learning rates that were used in pre-training.
When I check the paper, it describes learning rates for the BERT or AR objective.
However, from my reading of the paper,
I understand that pre-training is conducted with the combined BERT+AR objective, not with a stand-alone BERT or AR objective.