How to judge whether the pre-training model has converged?

Question

Opened this issue 3 years ago · 0 comments

How should the losses of the different pre-training tasks be weighted against each other? And which task's loss should be used to decide whether training has converged?
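To make the question concrete, here is a minimal sketch of the setup I have in mind. It is not taken from this repository: the task names (`task_a`, `task_b`), the weights, and both helper functions are hypothetical. It assumes the total loss is a weighted sum of the per-task losses and that convergence is checked per task by comparing the mean loss over two consecutive windows.

```python
from typing import Dict, List


def weighted_total(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-task losses into one scalar as a weighted sum."""
    return sum(weights[name] * value for name, value in losses.items())


def has_plateaued(history: List[float], window: int = 3, tol: float = 1e-2) -> bool:
    """Return True if the mean loss over the last `window` steps improved by
    less than `tol` (relative) compared with the window before it."""
    if len(history) < 2 * window:
        return False  # not enough points to compare two windows
    prev = sum(history[-2 * window:-window]) / window
    last = sum(history[-window:]) / window
    return (prev - last) < tol * abs(prev)


# Hypothetical per-task losses and weights (assumed values, for illustration only).
step_losses = {"task_a": 2.0, "task_b": 0.5}
task_weights = {"task_a": 1.0, "task_b": 0.5}
total = weighted_total(step_losses, task_weights)

# Hypothetical loss curves: each task is checked on its own curve.
curves = {
    "task_a": [3.0, 2.0, 1.5, 1.49, 1.5, 1.49],
    "task_b": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
}
status = {name: has_plateaued(curve) for name, curve in curves.items()}
```

With this kind of setup, one task can flatten out while another is still improving, which is why I am unsure whether to watch the weighted total or one particular task.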