This is the repository for DisTrO, a family of architecture-agnostic and network-agnostic distributed optimizers that reduce inter-GPU communication requirements by four to five orders of magnitude without relying on amortized analysis. This enables low-latency training of large neural networks over slow internet connections and on heterogeneous networking hardware.
- Aug. 26th, 2024: Preliminary Report
- Coming Soon: Paper and Code
- In The Near Future: 👀
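DisTrO's own algorithm is not described here (the paper and code are listed above as "Coming Soon"). As a rough illustration of what "reducing inter-GPU communication by orders of magnitude" means in practice, the sketch below shows generic top-k gradient sparsification, a well-known compression technique that is **not** DisTrO's method: each worker would transmit only the largest-magnitude gradient entries instead of the full dense tensor. The names `compress_grad`, `decompress_grad`, and `k_fraction` are hypothetical and exist only for this example.

```python
# Illustrative only: this is generic top-k gradient sparsification,
# not DisTrO's (unpublished) algorithm. All names here are hypothetical.
import math
import torch

def compress_grad(grad: torch.Tensor, k_fraction: float = 1e-4):
    """Keep only the k largest-magnitude entries of a gradient tensor.

    Returns the values and flat indices a worker would transmit,
    instead of the full dense gradient.
    """
    flat = grad.flatten()
    k = max(1, int(flat.numel() * k_fraction))
    _, indices = torch.topk(flat.abs(), k)
    return flat[indices], indices  # payload: k values + k indices

def decompress_grad(values: torch.Tensor, indices: torch.Tensor, shape):
    """Rebuild a dense (mostly zero) gradient from the transmitted payload."""
    flat = torch.zeros(math.prod(shape), dtype=values.dtype)
    flat[indices] = values
    return flat.reshape(shape)

if __name__ == "__main__":
    g = torch.randn(4096, 4096)              # stand-in for a dense gradient
    vals, idx = compress_grad(g, k_fraction=1e-4)
    g_hat = decompress_grad(vals, idx, g.shape)

    full_bytes = g.numel() * g.element_size()
    sent_bytes = vals.numel() * vals.element_size() + idx.numel() * idx.element_size()
    print(f"dense gradient: {full_bytes / 1e6:.1f} MB")
    print(f"transmitted:    {sent_bytes / 1e6:.3f} MB "
          f"(~{full_bytes / sent_bytes:.0f}x smaller)")
```

Run as a script, this prints the dense payload size next to the compressed one; the reduction scales directly with `k_fraction`. How DisTrO actually achieves its reported reduction will be detailed in the forthcoming paper and code.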
Join us on Discord if you're interested in helping us research and build the future of distributed training.