Gradient-Compression

We present a set of all-reduce-compatible gradient compression algorithms that significantly reduce communication overhead while maintaining the performance of vanilla SGD. We empirically evaluate these compression methods by training deep neural networks on the CIFAR-10 dataset.
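
To illustrate what "all-reduce compatible" compression can look like, here is a minimal sketch (not this repository's implementation) of a PowerSGD-style low-rank compressor. Because the low-rank factors are linear in the gradient, summing them across workers with all-reduce approximates summing the full gradients. The class name `LowRankCompressor` and the parameter `rank` are hypothetical, chosen for illustration only.

```python
# Illustrative sketch of rank-r gradient compression (PowerSGD-style).
# Assumes all workers share the same random seed so the basis Q matches,
# which is what makes the compressed factors safe to all-reduce.
import torch


class LowRankCompressor:
    def __init__(self, rank: int = 2):
        self.rank = rank

    def compress(self, grad: torch.Tensor):
        """Approximate a 2-D gradient as P @ Q^T with `rank` columns."""
        m, n = grad.shape
        q = torch.randn(n, self.rank)       # shared random basis (same on all workers)
        p = grad @ q                          # P = G Q        -> all-reduce P across workers
        p_hat, _ = torch.linalg.qr(p)         # orthogonalise the column basis
        q_new = grad.t() @ p_hat              # Q = G^T P_hat  -> all-reduce Q across workers
        return p_hat, q_new

    def decompress(self, p_hat: torch.Tensor, q_new: torch.Tensor) -> torch.Tensor:
        """Reconstruct the low-rank approximation of the gradient."""
        return p_hat @ q_new.t()


# Usage: compress a dummy gradient and check the reconstruction error.
grad = torch.randn(256, 128)
comp = LowRankCompressor(rank=4)
p_hat, q_new = comp.compress(grad)
approx = comp.decompress(p_hat, q_new)
print("relative error:", ((grad - approx).norm() / grad.norm()).item())
```

In a distributed run, each worker would all-reduce `p_hat` (before orthogonalisation) and `q_new` instead of the full gradient, sending only `rank * (m + n)` values per layer rather than `m * n`.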

Primary Language: Python
License: MIT
