IJCAI2023-OptimalShardedDataParallel

[IJCAI2023] An automated parallel training system that combines the advantages of data parallelism and model parallelism. If you are interested, please visit, star, or fork https://github.com/Youhe-Jiang/OptimalShardedDataParallel

Primary language: Python. License: MIT.
