/GPT2_from_scratch

An end-to-end implementation of OpenAI's GPT-2 model, trained on the FineWeb dataset and evaluated on HellaSwag.

Primary language: Python
License: Apache License 2.0 (Apache-2.0)
