AdamW for bfloat16 weights with stochastic rounding
Primary language: Python. License: GNU Affero General Public License v3.0 (AGPL-3.0).
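For readers unfamiliar with the term in the title, here is a minimal sketch of stochastic rounding from float32 to bfloat16, written with the standard library only. It is not taken from this repository: the function names (`float_to_bits`, `bits_to_float`, `stochastic_round_to_bf16`) are hypothetical, and the bit-level approach shown (add 16 uniform random bits, then truncate) is one common way to do it, assumed here for illustration.

```python
import random
import struct


def float_to_bits(x: float) -> int:
    """Return the IEEE 754 float32 bit pattern of x as an unsigned 32-bit int.

    Python floats are float64, so packing with "<f" first rounds x to float32.
    """
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits: int) -> float:
    """Interpret an unsigned 32-bit int as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def stochastic_round_to_bf16(x: float, rng: random.Random) -> float:
    """Stochastically round x to a value representable in bfloat16.

    bfloat16 keeps only the upper 16 bits of a float32. Adding a uniform
    random integer in [0, 2**16) to the bit pattern and then clearing the
    lower 16 bits rounds away from zero with probability equal to the
    discarded fraction, so the result is unbiased in expectation.
    """
    bits = float_to_bits(x)
    # Leave infinities and NaNs untouched (exponent bits all ones).
    if (bits & 0x7F800000) == 0x7F800000:
        return x
    bits = (bits + rng.getrandbits(16)) & 0xFFFF0000
    return bits_to_float(bits)


# Hypothetical usage: a value a quarter of the way between two bfloat16
# neighbours rounds to either neighbour, and the average recovers the input.
rng = random.Random(0)
x = 1.0 + 2.0 ** -9
samples = [stochastic_round_to_bf16(x, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
```

In an optimizer of the kind the title describes, the weight update would typically be computed in float32 and the result rounded this way before being stored back into the bfloat16 weights, so that updates smaller than the bfloat16 spacing are not systematically lost. How this repository does it may differ.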