CyberZHG/keras-radam

epsilon not compatible with Adam

KarlisFre opened this issue · 1 comment

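In this implementation, epsilon is added inside the square root, whereas in Adam it is added outside, i.e. sqrt(v + epsilon) here versus sqrt(v) + epsilon in Adam. To get comparable behavior when switching from Adam to RAdam, one should therefore pass the square of the epsilon value used before (for example, 1e-14 instead of 1e-7). Keeping the old value makes the effective epsilon much larger, which may hurt performance. Please implement epsilon the same way as in Adam, or at least clearly document the difference.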
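
To illustrate the mismatch, here is a minimal sketch of the two denominator forms. It is not the code from this repository: the function names and the `v_hat` argument (standing in for the second-moment estimate) are hypothetical and used only for illustration. The two forms match exactly only when `v_hat` is zero; for larger `v_hat`, epsilon barely matters in either form.

```python
import math


def denom_eps_outside(v_hat, eps):
    """Adam-style denominator: epsilon added outside the square root."""
    return math.sqrt(v_hat) + eps


def denom_eps_inside(v_hat, eps):
    """Denominator with epsilon added inside the square root."""
    return math.sqrt(v_hat + eps)


if __name__ == "__main__":
    eps = 1e-7  # example value only; an assumption, not taken from the issue
    for v_hat in (0.0, 1e-12, 1e-2):
        outside = denom_eps_outside(v_hat, eps)
        inside_same = denom_eps_inside(v_hat, eps)         # same epsilon reused
        inside_squared = denom_eps_inside(v_hat, eps ** 2)  # epsilon squared
        print(v_hat, outside, inside_same, inside_squared)
```

When `v_hat` is 0 and `eps` is 1e-7, the outside form gives 1e-7, while the inside form with the same `eps` gives about 3.2e-4, more than three orders of magnitude larger. Passing `eps ** 2` to the inside form brings it back to about 1e-7.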

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.