The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
Primary language: Python. License: MIT.