What do you think about diffgrad?
hadaev8 opened this issue · 3 comments
hadaev8 commented
This is a great repo, my respect.
JRC1995 commented
Thanks.
I haven't read the paper on diffgrad. Abstract looks interesting.
hadaev8 commented
Also, what do you think about weight decay in the style of AdamW, and about gradient norming? Will I break anything with gradient norming?
JRC1995 commented
It should be good.
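For anyone landing here later, a rough sketch of how the two pieces might combine. It assumes "gradient norming" means clipping by the global L2 norm, and it uses a plain gradient step rather than the optimizers in this repo. The names `clip_by_global_norm` and `decoupled_decay_step` are hypothetical and not part of any library.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their global L2 norm is at most max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total > max_norm and total > 0.0:
        scale = max_norm / total
        return [g * scale for g in grads]
    return list(grads)


def decoupled_decay_step(params, grads, lr=0.1, weight_decay=0.01, max_norm=1.0):
    """One plain gradient step with norm clipping and decoupled weight decay."""
    clipped = clip_by_global_norm(grads, max_norm)
    # Decoupled (AdamW-style) decay: shrink the weights directly instead of
    # adding weight_decay * p to the gradient before the update.
    return [p - lr * g - lr * weight_decay * p for p, g in zip(params, clipped)]


if __name__ == "__main__":
    print(decoupled_decay_step([1.0, -2.0], [3.0, 4.0]))
```

Since the decay term is applied to the weights and not folded into the gradient, clipping in this sketch only rescales the gradient part of the update and leaves the decay strength untouched.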