/Sophia

Copy of the official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”

Primary language: Python

License: MIT
