Pre-training a Transformer from scratch.
Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).