Code for the ICML'20 paper "Improving Transformer Optimization Through Better Initialization"
Primary language: Python. License: MIT.