/AdasOptimizer

ADAS is short for Adaptive Step Size. Unlike other optimizers, which only normalize the derivative, ADAS fine-tunes the step size itself, with the goal of making step-size scheduling obsolete and achieving state-of-the-art training performance.
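To make the idea of tuning the step size during training concrete, here is a minimal C++ sketch. It is not the ADAS algorithm, whose update rule is not described here. It uses a simpler, well-known stand-in (hypergradient descent: grow the step size while consecutive gradients agree, shrink it when they disagree) on a toy 1-D objective. The names `DescentResult`, `grad`, and `hypergradient_descent`, the objective, and all constants are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: final parameter and final (adapted) step size.
struct DescentResult {
    double x;
    double step_size;
};

// Gradient of the toy objective f(x) = (x - 3)^2, whose minimum is at x = 3.
inline double grad(double x) { return 2.0 * (x - 3.0); }

// Illustrative stand-in for step-size adaptation (NOT the ADAS update rule):
// the step size is itself updated by gradient descent, using the product of
// the current and previous gradients as its (hyper)gradient signal.
DescentResult hypergradient_descent(double x0, double initial_step,
                                    double beta, int iterations) {
    assert(iterations >= 0);
    double x = x0;
    double step = initial_step;
    double prev_g = 0.0;  // no previous gradient on the first iteration
    for (int i = 0; i < iterations; ++i) {
        const double g = grad(x);
        // Same sign as last time: the step was too small, so increase it.
        // Opposite sign: we overshot, so decrease it.
        step += beta * g * prev_g;
        x -= step * g;
        prev_g = g;
    }
    return {x, step};
}
```

Calling `hypergradient_descent(0.0, 0.01, 0.001, 200)` starts from a deliberately small step size and lets the loop raise it on its own, so no hand-written schedule is involved. Under these assumed constants the run should end close to the minimum at `x = 3` with a final step size larger than the initial one.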

Primary Language: C++

License: MIT
