About the optimisation learning rates
VConchello opened this issue · 2 comments
VConchello commented
What was the criterion used to choose the learning rates on core.py:118-136?
It looks like the learning rate alternates between increasing and decreasing across iterations, and that it starts out increasing. Is that intentional?
yzhao062 commented
"In METAOD, we employ two strategies that help stabilize the training. First, we leverage meta-feature based (rather than random) initialization. Second, we use cyclical learning rates that help escape saddle points for better local optima [43]."
[43] L. N. Smith. Cyclical learning rates for training neural networks. In WACV, pages 464–472. IEEE Computer Society, 2017.
we do use this technique for better training :)
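For reference, a minimal sketch of the triangular cyclical schedule from Smith (2017) [43]; the specific values (`base_lr`, `max_lr`, `step_size`) are illustrative, not necessarily the ones used in core.py:

```python
def cyclical_lr(iteration, base_lr=0.001, max_lr=0.006, step_size=2000):
    """Triangular cyclical learning rate (Smith, 2017).

    The rate climbs linearly from base_lr to max_lr over step_size
    iterations, then descends back, repeating every 2 * step_size
    iterations. Illustrative values, not METAOD's actual settings.
    """
    cycle = iteration // (2 * step_size)
    x = abs(iteration / step_size - 2 * cycle - 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)
```

This matches the behavior observed in the question: the rate rises at first, then alternates between decreasing and increasing phases, which is what helps the optimizer escape saddle points.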
VConchello commented
Mmmh, I see. Thank you for answering so quickly and clearly :)