exponential lr
Robert-Hopkins opened this issue · 1 comments
Robert-Hopkins commented
As the epoch goes up, why does the lr increase instead of decrease?
such as:
3.1095e-06
6.2189e-06
9.3284e-06
1.2438e-05
1.5547e-05
...
aaron38 commented
Do you use a warmup strategy for training? During the warmup epochs, the learning rate gradually increases to prevent the gradients from exploding. After the warmup, the lr should stay at the base value (or follow the configured decay schedule).
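For reference, the values in the issue grow by a constant step each epoch, which is consistent with linear warmup toward a base lr of roughly 1e-4. A minimal sketch of that pattern, followed by exponential decay, might look like this (the `base_lr`, `warmup_epochs`, and `gamma` values here are illustrative assumptions, not taken from the repo's config):

```python
def lr_at_epoch(epoch, base_lr=1e-4, warmup_epochs=32, gamma=0.97):
    """Linear warmup, then exponential decay (a common schedule shape)."""
    if epoch < warmup_epochs:
        # lr ramps up linearly: base_lr/warmup_epochs, 2*base_lr/warmup_epochs, ...
        return base_lr * (epoch + 1) / warmup_epochs
    # after warmup, decay exponentially from base_lr
    return base_lr * gamma ** (epoch - warmup_epochs)

# During warmup the lr increases each epoch, as in the issue's printout:
for e in range(5):
    print(f"{lr_at_epoch(e):.4e}")
```

So an increasing lr in the first epochs is expected behavior under warmup, not a bug; the decay only begins once the warmup phase ends.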