Implementing the transformer architecture from scratch
Primary language: Jupyter Notebook
License: MIT