Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)
Primary language: Python
License: MIT