Pinned Repositories
caffe-optimized
dlbench
Benchmarking State-of-the-Art Deep Learning Software Tools
adahessian
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
B-Caffe
MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
eva
[ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation
gtopkssgd
gTop-k S-SGD: A Communication-Efficient Distributed Synchronous SGD Algorithm for Deep Learning
kfac_pytorch
Distributed K-FAC Preconditioner for PyTorch
webrtc-ios
WebRTC build for iOS
shyhuai's Repositories
shyhuai/B-Caffe
MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
shyhuai/eva
[ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation
shyhuai/gtopkssgd
gTop-k S-SGD: A Communication-Efficient Distributed Synchronous SGD Algorithm for Deep Learning
shyhuai/kfac_pytorch
Distributed K-FAC Preconditioner for PyTorch
shyhuai/adahessian
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
shyhuai/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
shyhuai/byteps
A high performance and generic framework for distributed DNN training
shyhuai/caffe
Caffe: a fast open framework for deep learning.
shyhuai/DataProvider.torch
Data providers for Torch
shyhuai/ddl-benchmarks-1
ddl-benchmarks: Benchmarks for Distributed Deep Learning
shyhuai/DeepSpeechRecognition
A Chinese deep speech recognition system, including a deep-learning-based acoustic model and a deep-learning-based language model
shyhuai/DistNeRF
A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.
shyhuai/dl_scheduling
shyhuai/FADNet
shyhuai/fiddler
Fast Inference of MoE Models with CPU-GPU Orchestration
shyhuai/GaussianK-SGD
Understanding Top-k Sparsification in Distributed Deep Learning
shyhuai/gblastn
G-BLASTN is a GPU-accelerated nucleotide alignment tool based on the widely used NCBI-BLAST.
shyhuai/horovod
Distributed training framework for TensorFlow, Keras, and PyTorch.
shyhuai/mpi4py
Python bindings for MPI with ULFM support
shyhuai/nccl
Optimized primitives for collective multi-GPU communication
shyhuai/openai-gemm
Open single- and half-precision GEMM implementations
shyhuai/Optimus-CC
[ASPLOS'23] Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression
shyhuai/pipedream
shyhuai/PySyft
A library for encrypted, privacy preserving deep learning - based on PyTorch
shyhuai/pytorch-sso
PyTorch-SSO: Scalable Second-Order methods in PyTorch
shyhuai/tensorflow
Computation using data flow graphs for scalable machine learning
shyhuai/thundersvm
ThunderSVM: A Fast SVM Library on GPUs and CPUs
shyhuai/training
Reference implementations of MLPerf™ training benchmarks
shyhuai/tutel
Tutel MoE: An Optimized Mixture-of-Experts Implementation
shyhuai/XLearning
AI on Hadoop