Canary

A Decentralized Distributed DL Architecture in the Multi-interface Network.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Prerequisites

Overview

There are four main folders:

  • model: the models used in DL training, including a common CNN, AlexNet, VGG19, Inception-V1/V3 and ResNeXt101/152.
  • 8bit_quantization: data quantization that converts numbers from FP32 to INT8 format.
  • QAT_for_FP_BP: 8-bit Quantization-Aware Training (QAT) for both the forward (parameter quantization) and backward (gradient sketch) propagation stages.
  • topology: parameter server (PS) on FatTree and AllReduce on BCube.
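
To illustrate what the FP32-to-INT8 conversion in 8bit_quantization involves, here is a minimal sketch in plain Python. It assumes symmetric per-tensor quantization with a single scale factor, which may differ from the scheme the repository actually uses; the function names quantize_int8 and dequantize_int8 are hypothetical and do not come from the repository.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to INT8 (assumed scheme)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric INT8 range.
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize_int8(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from INT8 codes."""
    return [x * scale for x in q]


# Hypothetical weights, used only for illustration.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each restored value differs from the original by at most half a quantization step (scale / 2), which is the error that quantization-aware training has to tolerate.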

Dataset

Three classical datasets are supported: MNIST, Fashion-MNIST, and CIFAR-10.