snimu/gradient-rounding
Round the gradient during LLM training to different degrees; compare "scaling" of rounding to different significant digits to parameter scaling
Python · Apache-2.0
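As a rough illustration of what "rounding to different significant digits" could mean for a gradient, here is a minimal pure-Python sketch. It is not taken from the repository: the function names `round_to_sig_digits` and `round_gradient` are hypothetical, the gradient is modeled as a plain list of floats rather than a tensor, and the repository's actual rounding scheme may differ.

```python
import math


def round_to_sig_digits(x: float, digits: int) -> float:
    """Round x to `digits` significant decimal digits (hypothetical helper)."""
    if digits < 1:
        raise ValueError("digits must be at least 1")
    # Zero, inf, and nan have no well-defined leading digit; return unchanged.
    if x == 0.0 or not math.isfinite(x):
        return x
    # Position of the leading digit, e.g. -2 for 0.012345 and 5 for 123456.0.
    exponent = math.floor(math.log10(abs(x)))
    # Keep `digits` digits counting from the leading one.
    return round(x, digits - 1 - exponent)


def round_gradient(grad: list[float], digits: int) -> list[float]:
    """Apply significant-digit rounding elementwise (assumed list-of-floats gradient)."""
    return [round_to_sig_digits(g, digits) for g in grad]


rounded = round_gradient([0.012345, 123456.0, 0.0], 2)
print(rounded)  # [0.012, 120000.0, 0.0]
```

In an actual training loop the same idea would be applied to each parameter's gradient between the backward pass and the optimizer step, with `digits` as the knob being varied.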
Stargazers
No one has starred this repository yet.