/why-weight-decay

Why Do We Need Weight Decay in Modern Deep Learning? [arXiv, Oct 2023]

Primary Language: Python
License: Other (NOASSERTION)
