CyberZHG/keras-radam

Do I need to tune learning rates?

xuzhang5788 opened this issue · 4 comments

Thank you so much for your great implementation.
Do I need to add a callback like ReduceLROnPlateau? Can I combine RAdam and AdamW (Adam with weight decay)? And how about using RAdam with the one-cycle policy?

@xuzhang5788 About the callback, I believe so. Check the ~/tests/test_optimizers.py file: ReduceLROnPlateau is passed as a callback to model.fit() there.
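
For reference, here is a minimal sketch of that usage, not copied from the tests verbatim: it assumes standalone Keras, the `from keras_radam import RAdam` import from this package, and made-up toy data, and just shows RAdam compiled into a model with ReduceLROnPlateau handed to model.fit().

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import ReduceLROnPlateau
from keras_radam import RAdam  # assumes the keras-radam package is installed

# Tiny toy model and data, only to illustrate the call pattern.
model = Sequential([
    Dense(8, activation='relu', input_shape=(4,)),
    Dense(1),
])
model.compile(optimizer=RAdam(), loss='mse')

x = np.random.random((64, 4))
y = np.random.random((64, 1))

# ReduceLROnPlateau is passed via the callbacks argument of model.fit().
model.fit(
    x, y,
    epochs=10,
    callbacks=[ReduceLROnPlateau(monitor='loss', factor=0.5, patience=2)],
)
```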

@pedromlsreis Thank you for your reply.
The paper says that RAdam dynamically adjusts the adaptive learning rate, so why should we still schedule learning rate decay?

@xuzhang5788 Oh, okay. I can't answer that, as I now have the same doubt :)

See the guide in the official repo:

Directly replace the vanilla Adam with RAdam without changing any settings

The callback in the tests is just there because it's a feature that should be supported, not something you have to use.
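
In other words, the drop-in advice amounts to swapping only the optimizer in an existing Adam setup. A hedged sketch, again assuming the `keras_radam` import and standalone Keras:

```python
from keras.models import Sequential
from keras.layers import Dense
from keras_radam import RAdam  # assumed import from the keras-radam package

model = Sequential([Dense(8, activation='relu', input_shape=(4,)), Dense(1)])

# Before: model.compile(optimizer='adam', loss='mse')
# After: only the optimizer changes; no other settings are touched.
model.compile(optimizer=RAdam(), loss='mse')
```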