Why does the total loss increase suddenly and rapidly when training the base CRF?
DtYXs opened this issue · 1 comments
train crf
bert lr 2e-5
other lr 2e-3
epoch 10
The total loss decreases during the first 5 epochs, down to about 20.
But then it suddenly and rapidly increases to about 10000.
With 'get_linear_schedule_with_warmup' the lr should keep getting smaller and smaller. The phenomenon seems to disappear when I change the schedule. I don't understand why the loss increases.
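For context, here is a self-contained sketch (with made-up warmup and total-step counts, not the questioner's actual values) of the linear warmup-then-decay multiplier that 'get_linear_schedule_with_warmup' computes, confirming that the learning rate does shrink monotonically once warmup is over:

```python
def linear_warmup_decay(step, warmup_steps, total_steps):
    """Multiplier applied to the base lr: ramp up linearly during warmup,
    then decay linearly to zero at total_steps."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Hypothetical schedule: 100 warmup steps out of 1000 total.
mults = [linear_warmup_decay(s, 100, 1000) for s in (0, 50, 100, 500, 1000)]
# → [0.0, 0.5, 1.0, ≈0.556, 0.0]
```

So after step 100 the multiplier only decreases, which is why a loss explosion late in training is surprising under this schedule.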
Hello, I think the fundamental reason for this phenomenon is that the other learning rate of 2e-3 is too high; you could apply the same schedule with 2e-5 instead. As for why the spike appears even while the lr is shrinking, as you said, I think the local gradient also increases suddenly.
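A minimal sketch of the suggested fix, assuming a PyTorch setup with two parameter groups (the `Linear` layers below are hypothetical stand-ins for the BERT encoder and the CRF/other layers): both groups get the same base lr of 2e-5 and share one warmup-then-decay schedule implemented via `LambdaLR`.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

# Hypothetical stand-ins for the BERT encoder and the CRF/other layers.
bert_like = torch.nn.Linear(4, 4)
other_like = torch.nn.Linear(4, 4)

optimizer = AdamW([
    {"params": bert_like.parameters(), "lr": 2e-5},
    {"params": other_like.parameters(), "lr": 2e-5},  # lowered from 2e-3
])

total_steps, warmup_steps = 100, 10

def lr_lambda(step):
    # Linear warmup, then linear decay to zero (the same shape as
    # get_linear_schedule_with_warmup).
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

scheduler = LambdaLR(optimizer, lr_lambda)
```

With one shared multiplier, the ratio between the two groups' learning rates stays fixed, so lowering the second group's base lr directly caps how large its updates can get.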