google-deepmind/optax

Recommendation for the `eps_root` setting when differentiating through the Adam optimizer

itk22 opened this issue · 2 comments

Dear Optax team,
I am implementing Model-Agnostic Meta-Learning (MAML) in my project, and I noticed that using Optax's default Adam optimizer for the inner loop results in `nan` values in the meta-gradients. The documentation covers this well: it mentions that `eps_root` should be set to a small constant to avoid dividing by zero when rescaling. Could you please recommend a good default value for `eps_root` in a meta-learning scenario?
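For context, here is a minimal sketch of the kind of setup where I see this; the toy quadratic loss, parameter values, and learning rate below are only illustrative, not my actual model:

```python
import jax
import jax.numpy as jnp
import optax


def inner_loss(params, x):
    # Toy quadratic objective standing in for the inner-loop task loss.
    return jnp.sum((params * x) ** 2)


def meta_loss(init_params, x):
    # One inner-loop Adam step, differentiated with respect to init_params
    # as in MAML. Default Adam has eps_root=0.0.
    opt = optax.adam(learning_rate=0.1)
    opt_state = opt.init(init_params)
    grads = jax.grad(inner_loss)(init_params, x)
    updates, opt_state = opt.update(grads, opt_state, init_params)
    adapted = optax.apply_updates(init_params, updates)
    return inner_loss(adapted, x)


init = jnp.array([1.0, -2.0])
x = jnp.array([0.0, 3.0])  # a zero inner gradient for one parameter is enough
meta_grads = jax.grad(meta_loss)(init, x)
print(meta_grads)  # with eps_root=0.0 this shows nan for that parameter
```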

Hi Igor, thanks for reaching out! `eps_root=1e-8` might be a suitable choice for this case.
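That is, something along these lines (the learning rate here is just a placeholder):

```python
import optax

# eps_root is added inside the square root used to rescale updates, so the
# derivative of the update stays finite even when the second-moment estimate
# is zero, which is what makes meta-gradients through Adam well defined.
inner_optimizer = optax.adam(learning_rate=1e-3, eps_root=1e-8)
```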

Thank you, I will keep using this value for my experiments then :)