z814081807/DeepNER

Why does the total loss increase suddenly and rapidly when training the base CRF?

DtYXs opened this issue · 1 comment

DtYXs commented

train crf
bert lr 2e-5
other lr 2e-3
epoch 10

The total loss decreases over the first 5 epochs, down to about 20.
But then it suddenly spikes to about 10000.

With 'get_linear_schedule_with_warmup' the lr should only get smaller and smaller, yet the problem seems to disappear when I change the schedule. I don't understand why the loss increases.
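For reference, the multiplier that `get_linear_schedule_with_warmup` applies to each param group's base lr can be sketched like this (a plain-Python re-implementation for illustration — the real schedule lives in the `transformers` library, and the step counts below are hypothetical, not from the issue):

```python
def linear_warmup_factor(step: int, warmup_steps: int, total_steps: int) -> float:
    """Multiplier applied to every param group's base lr at `step`:
    ramps linearly 0 -> 1 during warmup, then decays linearly 1 -> 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

bert_base_lr, other_base_lr = 2e-5, 2e-3  # the two lrs from this issue
total_steps, warmup_steps = 1000, 100     # hypothetical step counts

# After warmup the multiplier shrinks monotonically, so both lrs decay --
# but the "other" group still starts 100x larger than the BERT group.
for step in (50, 100, 500, 999):
    f = linear_warmup_factor(step, warmup_steps, total_steps)
    print(step, f * bert_base_lr, f * other_base_lr)
```

This shows why the schedule alone cannot explain the spike: the lr is strictly non-increasing after warmup, so a loss explosion points at the gradients (or the lr gap between groups) rather than the schedule itself.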


Hello, I think the fundamental reason for this phenomenon is that the other learning rate of 2e-3 is too high; you could drive those parameters with the same schedule at 2e-5. As for why the spike appears even though the lr keeps shrinking, I think the local gradient also increases suddenly at that point.
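The suggested fix — giving the CRF/classifier parameters the same small lr as BERT instead of 2e-3 — could be set up like this. The dicts use the param-group shape accepted by PyTorch optimizers such as `AdamW`; `bert_params` and `other_params` are placeholders for the real parameter lists, not names from this repo:

```python
def build_param_groups(bert_params, other_params, lr=2e-5):
    """Optimizer param groups with a shared small lr for both groups.

    Keeping two groups still allows per-group settings later (e.g. weight
    decay), while avoiding the 100x lr gap that the reply blames for the
    loss spike.
    """
    return [
        {"params": bert_params, "lr": lr},   # BERT encoder: 2e-5 as before
        {"params": other_params, "lr": lr},  # CRF & classifier: lowered from 2e-3
    ]

# Placeholder parameter lists for illustration only.
groups = build_param_groups(["bert.weight"], ["crf.transitions"])
```

A single scheduler (e.g. `get_linear_schedule_with_warmup`) attached to this optimizer then scales both groups' lrs in lockstep.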