About the optimisation learning rates
VConchello opened this issue · 2 comments
VConchello commented
What was the criterion used to choose the learning rates on core.py:118-136?
It looks like the learning rate alternates between increasing and decreasing across iterations, and that it starts out increasing. Is that intentional?
yzhao062 commented
"In METAOD, we employ two strategies that help stabilize the training. First, we leverage meta-feature based (rather than random) initialization. Second, we use cyclical learning rates that help escape saddle points for better local optima [43]."
[43] L. N. Smith. Cyclical learning rates for training neural networks. In WACV, pages 464–472. IEEE Computer Society, 2017.
we do use this technique for better training :)
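For reference, a minimal sketch of the triangular cyclical schedule from Smith (2017) [43]; the specific values (`base_lr`, `max_lr`, `step_size`) are illustrative, not necessarily the ones used in core.py:

```python
def cyclical_lr(iteration, base_lr=0.001, max_lr=0.006, step_size=2000):
    """Triangular cyclical learning rate (Smith, 2017).

    The rate climbs linearly from base_lr to max_lr over step_size
    iterations, then descends back, repeating every 2 * step_size
    iterations. Illustrative values, not METAOD's actual settings.
    """
    cycle = iteration // (2 * step_size)
    x = abs(iteration / step_size - 2 * cycle - 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)
```

This matches the behavior observed in the question: the rate rises at first, then alternates between decreasing and increasing phases, which is what helps the optimizer escape saddle points.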
VConchello commented
Mmmh, I see. Thank you for answering so quickly and clearly :)