A minimal example of fine-tuning autoregressive language models with multiple GPUs and DeepSpeed
Primary LanguagePythonMIT LicenseMIT