cofe-ai/Mu-scaling
Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales
Python