cliang1453/SAGE
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)
Python · MIT
Issues
- 2
Reproduction of machine translation results
#1 opened by zwhe99