CyberZHG/keras-radam

amsgrad parameter

nicolaspanel opened this issue · 3 comments

Hi @CyberZHG, and thank you for sharing this!
Have you run any experiments with amsgrad=True?
If so, have you noticed a significant improvement compared to RAdam+warmup alone?
Best regards

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Bump, I'm also interested in this, @CyberZHG.
