yandex-research/swarm
Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient"
Python
Issues
Backwards fault recovery
#2 opened by NikolayBlagoev - 2
Amazing work !
#1 opened by tchaton