guotong1988/BERT-GPU
Multi-GPU pre-training of BERT from scratch on a single machine, using data parallelism without Horovod
Python · Apache-2.0