PyTorch Distributed Data Parallel (DDP) implementation of a text-based transformer model, written from scratch.
Primary Language: Python
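
For orientation, here is a minimal sketch of what a DDP training step over a small text transformer can look like. It is not taken from this repository: `TinyTextTransformer`, `run_single_step`, the hyperparameters, and the toy target (predicting the input tokens themselves) are all hypothetical. It also uses PyTorch's built-in `nn.TransformerEncoder` as a stand-in for a from-scratch model, omits positional encodings for brevity, and assumes a single CPU process with the `gloo` backend so it can run without GPUs.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


class TinyTextTransformer(nn.Module):
    """Hypothetical toy model: token embedding -> encoder -> vocabulary logits."""

    def __init__(self, vocab_size=100, d_model=32, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Stand-in for a hand-written encoder stack (assumption, not the repo's code).
        layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward=64, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids -> logits: (batch, seq_len, vocab_size)
        return self.head(self.encoder(self.embed(tokens)))


def run_single_step(rank=0, world_size=1):
    """Run one optimizer step under DDP and return (logits shape, loss value)."""
    # Rendezvous settings; a launcher such as torchrun normally sets these.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    try:
        torch.manual_seed(0)
        model = DDP(TinyTextTransformer())  # CPU module, so no device_ids
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

        tokens = torch.randint(0, 100, (8, 16))  # fake batch of token ids
        logits = model(tokens)
        # Toy objective: reconstruct the input tokens (illustration only).
        loss = nn.functional.cross_entropy(logits.view(-1, 100), tokens.view(-1))

        optimizer.zero_grad()
        loss.backward()  # DDP averages gradients across ranks during backward
        optimizer.step()
        return logits.shape, loss.item()
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    print(run_single_step())
```

In a real multi-process run, each process would call the step function with its own `rank` and the shared `world_size`, and the data loader would typically use `torch.utils.data.distributed.DistributedSampler` so that every rank sees a different shard of the text.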