/why-clipping-accelerates

A pytorch implementation for the LSTM experiments in the paper: Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity

Primary LanguagePythonBSD 3-Clause "New" or "Revised" LicenseBSD-3-Clause

Watchers