/gpt-2

Train the 774M and 1.5B models with Google's SM3 optimizer

Primary language: Python. License: MIT.
