/why-weight-decay

Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]

Primary language: Python
License: Other (NOASSERTION)
