cliang1453/SAGE
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)
Python · MIT License
Stargazers
- ag027592 (https://biic.ee.nthu.edu.tw)
- ajppp (Singapore)
- allanj (Salesforce Research)
- by2101
- CheeseTurtle
- ClaudiaShu (University College London)
- cliang1453 (@microsoft, @gatech)
- fly51fly (PRIS)
- GanjinZero (DAMO Academy)
- jdposada
- JeffCarpenter (Canada)
- jinmingteo (Singapore)
- JoesSattes (Bangkok)
- kaishxu (Hong Kong)
- limberc (Oxford, UK)
- mingboiz (Sprinklr)
- monatis (@qdrant)
- namisan (MSR)
- NamlessM
- PeacePeaceHan
- saroyehun
- tourzhao
- Vbansal21
- WenzhengZhang
- wwangdg
- xuanhan863 (Los Angeles, USA)
- yucornetto (Johns Hopkins University)
- zhhongzhi
- zwhe99 (Shanghai Jiao Tong University)