# Simple-BERT-RoBERTa-Pretrain

You need just ONE script for the whole pretraining process! A HuggingFace reproduction of BERT/RoBERTa pretraining from scratch, with memory optimization via DeepSpeed and BF16.
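At the heart of BERT/RoBERTa pretraining is the masked language modeling (MLM) objective: roughly 15% of input tokens are selected, and of those, 80% are replaced with `[MASK]`, 10% with a random token, and 10% left unchanged. The sketch below illustrates that corruption rule in plain Python; the function name `mask_tokens` and the toy vocabulary are illustrative, not part of this repo (in practice `transformers.DataCollatorForLanguageModeling` handles this):

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, vocab=("a", "b", "c"), rng=None):
    """BERT-style MLM corruption (illustrative sketch).

    Selects ~mask_prob of the tokens; of those, 80% become [MASK],
    10% become a random vocabulary token, 10% stay unchanged.
    Returns (corrupted_tokens, labels), where labels hold the original
    token at selected positions and None elsewhere (no loss computed).
    """
    rng = rng or random.Random(0)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)          # predict the original token here
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK_TOKEN)      # 80%: mask it
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))  # 10%: random token
            else:
                corrupted.append(tok)             # 10%: keep as-is
        else:
            labels.append(None)         # position not selected: ignored in loss
            corrupted.append(tok)
    return corrupted, labels
```

Keeping 10% of selected tokens unchanged forces the model to maintain a good representation of every input token, since it cannot rely on `[MASK]` alone to signal which positions are being predicted.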
