Changing init learning rate
Kraut-Inferences opened this issue · 2 comments
Kraut-Inferences commented
Does modifying the initial learning rate hurt the algorithm in any way? I want to use exponential decay but don't know whether it would improve performance.
juntang-zhuang commented
From my experience with a ViT model on ImageNet, AdaBelief improves over Adam when both use the default cosine learning rate schedule. I think it should work with other models too.
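For comparison, here is a minimal sketch of the two schedules being discussed, written as plain functions of the step count (the function names and default values below are illustrative, not from AdaBelief itself; in PyTorch you would attach `torch.optim.lr_scheduler.ExponentialLR` or `CosineAnnealingLR` to the optimizer instead). Changing only the initial learning rate rescales the whole curve without changing the decay shape:

```python
import math

def exponential_lr(step, base_lr=1e-3, gamma=0.96):
    # Exponential decay: lr_t = lr_0 * gamma^t
    return base_lr * gamma ** step

def cosine_lr(step, total_steps, base_lr=1e-3, min_lr=0.0):
    # Cosine annealing from base_lr down to min_lr over total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * step / total_steps))

# Both start at base_lr; cosine decays slowly at first, then faster,
# while exponential decays by a fixed factor every step.
for t in (0, 50, 100):
    print(t, exponential_lr(t), cosine_lr(t, total_steps=100))
```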
Kraut-Inferences commented
Thank you.