This is the official PyTorch implementation for "Mesa: A Memory-saving Training Framework for Transformers".
Primary language: Python. License: Apache License 2.0 (Apache-2.0).