/gradient-checkpointing

Make huge neural nets fit in memory

Primary LanguagePythonMIT LicenseMIT

Stargazers