/adamw-bf16

AdamW for bfloat16 weights with stochastic rounding
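As a rough illustration of the rounding step named in the description, the following is a minimal pure-Python sketch of stochastic rounding from float32 to bfloat16. It is not taken from this repository; the function names `stochastic_round_bf16`, `f32_bits`, and `bits_f32` are hypothetical, and the sketch assumes the common bit-level approach of adding 16 random bits to the float32 pattern before truncating the low 16 bits.

```python
import math
import random
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE-754 float32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b: int) -> float:
    """Inverse of f32_bits: interpret a 32-bit pattern as a float32."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def stochastic_round_bf16(x: float, rng: random.Random) -> float:
    """Round x (assumed representable as float32) to a bfloat16 value.

    Hypothetical helper, not the repository's code. bfloat16 keeps the
    top 16 bits of a float32, so adding a uniform 16-bit integer and
    then clearing the low 16 bits rounds away from zero with probability
    equal to the discarded fraction. The result is unbiased in expectation.
    """
    if math.isnan(x) or math.isinf(x):
        return x  # leave non-finite values untouched
    bits = f32_bits(x)
    bits = (bits + rng.getrandbits(16)) & 0xFFFF0000
    return bits_f32(bits)


# A small update of 2**-10 on a weight of 1.0 is below bfloat16 resolution
# near 1.0 (spacing 2**-7), so round-to-nearest would always return 1.0.
rng = random.Random(0)
target = 1.0 + 2.0**-10
samples = [stochastic_round_bf16(target, rng) for _ in range(20000)]
print(sorted(set(samples)), sum(samples) / len(samples))
```

Each sample lands on one of the two neighbouring bfloat16 values, and the average over many samples approaches the unrounded target, which is why this kind of rounding lets small optimizer updates accumulate in low-precision weights.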

Primary language: Python
License: GNU Affero General Public License v3.0 (AGPL-3.0)
