
DisTrO

Distributed Training Over-The-Internet

This is the repository for DisTrO, a family of architecture-agnostic and network-agnostic distributed optimizers that reduce inter-GPU communication requirements by four to five orders of magnitude without relying on amortized analysis. This enables low-latency training of large neural networks over low-bandwidth internet connections and heterogeneous networking hardware.
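
DisTrO's algorithm itself is not yet public (see "Coming Soon" below). As a rough, generic sketch of why shrinking what GPUs exchange can cut traffic by orders of magnitude, here is a top-k gradient sparsification example in PyTorch. This illustrates the general idea of communication-reducing optimizers only; it is not DisTrO's method, and every name in it is hypothetical.

```python
# Generic illustration of communication reduction via gradient compression.
# NOT DisTrO's algorithm (unpublished at the time of this README); this only
# shows how exchanging a sparse top-k slice of a gradient, instead of the
# dense tensor, shrinks per-step traffic.

import torch

def topk_compress(grad: torch.Tensor, k: int):
    """Keep only the k largest-magnitude entries of a flattened gradient."""
    flat = grad.flatten()
    _, indices = torch.topk(flat.abs(), k)
    return flat[indices], indices, flat.numel()

def topk_decompress(values: torch.Tensor, indices: torch.Tensor, numel: int):
    """Rebuild a dense gradient: zeros everywhere except the kept entries."""
    flat = torch.zeros(numel, dtype=values.dtype)
    flat[indices] = values
    return flat

grad = torch.randn(10_000_000)  # stand-in for one worker's dense fp32 gradient
values, indices, numel = topk_compress(grad, k=1_000)

# Dense exchange ships every fp32 entry (4 bytes each); sparse exchange ships
# k fp32 values plus k int64 indices (8 bytes each).
dense_bytes = numel * 4
sparse_bytes = values.numel() * 4 + indices.numel() * 8
print(f"traffic reduction: {dense_bytes / sparse_bytes:,.0f}x")  # ~3,333x here
```

In a real data-parallel run, the sparse (values, indices) pairs would be what workers exchange each step, with each peer reconstructing a dense update via something like topk_decompress; plain sparsification like this typically needs error feedback to converge well, which is one reason published schemes lean on amortized analysis that DisTrO, per the claim above, avoids.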

  • Aug. 26th, 2024: Preliminary Report
  • Coming Soon: Paper and Code
  • In The Near Future: 👀

Join us on Discord if you're interested in helping us research and build the future of distributed training.