How is the pre-training schedule configured?
NinedayWang opened this issue · 1 comment
NinedayWang commented
When training the base and large models, how were the learning rate, warmup, and other schedule settings configured for each?
Ag2S1 commented
For the training hyperparameters, we followed the LAMB paper:
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
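
For readers unfamiliar with that kind of schedule, here is a minimal sketch of a learning-rate curve with linear warmup followed by polynomial decay, the general shape used in LAMB-style BERT pre-training. The function name `lr_at_step` and every number in the usage lines are hypothetical placeholders; the thread does not state the values actually used for the base or large models.

```python
def lr_at_step(step, peak_lr, warmup_steps, total_steps, power=1.0):
    """Learning rate at a given step: linear warmup, then polynomial decay.

    All arguments are illustrative; none of them come from this thread.
    """
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must be greater than warmup_steps")
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: shrink from peak_lr to 0 over the remaining steps.
    remaining = max(total_steps - step, 0)
    return peak_lr * (remaining / (total_steps - warmup_steps)) ** power


# Placeholder values, chosen only to show the shape of the curve.
for s in (0, 50, 100, 550, 1000):
    print(s, lr_at_step(s, peak_lr=1e-3, warmup_steps=100, total_steps=1000))
```

With `power=1.0` the decay is linear; the peak learning rate, warmup length, and total step count for a given batch size would need to be taken from the paper or confirmed by the maintainers.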