Pinned Repositories
benchmarks
A benchmark framework for TensorFlow
byteps
A high performance and generic framework for distributed DNN training
horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and MXNet.
imagenet18
Train ImageNet in 18 minutes on AWS
incubator-mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dependency Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more
lightseq
LightSeq: A High Performance Library for Sequence Processing and Generation
ns3-rdma
NS3 simulator for RDMA over Converged Ethernet v2 (RoCEv2), including the implementation of DCQCN, TIMELY, PFC, ECN and shared buffer switch
pfcdeadlock
Freeflow
High-performance container overlay networks on Linux. Freeflow enables RDMA (on both InfiniBand and RoCE) and accelerates TCP to bare-metal performance, and it requires no modification to application code or binaries.
bobzhuyb's Repositories
bobzhuyb/ns3-rdma
NS3 simulator for RDMA over Converged Ethernet v2 (RoCEv2), including the implementation of DCQCN, TIMELY, PFC, ECN and shared buffer switch
bobzhuyb/pfcdeadlock
bobzhuyb/benchmarks
A benchmark framework for TensorFlow
bobzhuyb/byteps
A high performance and generic framework for distributed DNN training
bobzhuyb/horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and MXNet.
bobzhuyb/imagenet18
Train ImageNet in 18 minutes on AWS
bobzhuyb/incubator-mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dependency Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more
bobzhuyb/lightseq
LightSeq: A High Performance Library for Sequence Processing and Generation
bobzhuyb/xla
Enabling PyTorch on Google TPU