/Megatron-DeepSpeed

Ongoing research on training transformer language models at scale, including BERT and GPT-2

Primary Language: Jupyter Notebook · Other · NOASSERTION
