JingzhaoZhang/why-clipping-accelerates
A PyTorch implementation of the LSTM experiments in the paper "Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity"
Python · BSD-3-Clause
Stargazers
- akankshaaa03
- Akramz (Microsoft)
- APodolskiy (Moscow)
- BaseBlank
- bhattg (University of Washington)
- danielcieslinski (National Centre for Nuclear Research)
- dobriban (The Wharton School, University of Pennsylvania)
- dukebw (Hamilton, Ontario, Canada)
- EnnengYang (Northeastern University, China)
- fly51fly (PRIS)
- HaneolJang
- JakubCzarlinski
- Jianghanxiao (Columbia University)
- lbinmeng
- leiwu0 (PKU)
- liuqi8827 (Harbin Institute of Technology)
- lliai
- niklausliu (Monash University)
- nphard001
- orzqqqqqqq
- princenimo (Earth)
- ShayekhBinIslam (Dhaka, Bangladesh)
- sheepc
- stjordanis (Greece)
- sungyubkim
- tanyapohn
- vv111y (Niagara Falls)
- wanghaoxx99
- WatsonWangZh (Peking University)
- weixin00
- xiao2mo (trio.ai)
- XrosLiang
- yilunliao (MIT)
- zhly0
- zliangak
- zrl4836