What is the proper loss during training?
Closed this issue · 3 comments
fyting commented
For me, the training loss is about 11.5 at the beginning and about 9.5 at the end. Is that reasonable?
funnyzhou commented
In our experiments, the loss value does not mean much on its own. You have to finetune the checkpoint to check the effect of pretraining.
fyting commented
In the case of using 700k samples, how many epochs of training are needed to achieve the best performance?
funnyzhou commented
I used 150 epochs with a batch size of 256.
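For anyone translating that schedule into an optimizer-step budget, here is a rough back-of-the-envelope sketch, assuming the 700k sample count from the question above, the stated batch size of 256 and 150 epochs, and that the last partial batch of each epoch is dropped:

```python
# Rough optimization budget implied by the reply above.
# Assumptions: 700k pretraining samples, batch size 256, 150 epochs,
# last partial batch dropped (as with drop_last=True in a dataloader).
num_samples = 700_000
batch_size = 256
epochs = 150

steps_per_epoch = num_samples // batch_size  # 2734 full batches per epoch
total_steps = steps_per_epoch * epochs       # 410100 optimizer steps overall

print(steps_per_epoch)  # 2734
print(total_steps)      # 410100
```

So the reported recipe corresponds to roughly 410k pretraining steps; scaling the batch size up or down would change the step count proportionally if the epoch count is kept fixed.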