attesaarela/Sophia
Copy of the official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
Language: Python. License: MIT.