/Megatron-Kwai

[USENIX ATC '24] Accelerating the Training of Large Language Models using Efficient Activation Rematerialization and Optimal Hybrid Parallelism

Primary Language: Python
License: Other (NOASSERTION)
