data-parallelism
There are 48 public repositories under the data-parallelism topic.
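Before the listing, a quick orientation: data parallelism replicates the model on every worker, feeds each worker a different shard of the batch, and averages the resulting gradients before a shared parameter update. The sketch below is a minimal, framework-free illustration of that idea (not taken from any repository listed here); the linear model, shard layout, and learning rate are illustrative choices.

```python
# Minimal data-parallelism sketch: each "worker" holds a replica of the
# parameters, computes gradients on its own data shard, and the gradients
# are averaged (a software all-reduce) before the shared SGD update.
# This is the core pattern behind libraries like DeepSpeed or PyTorch DDP.

def grad_mse_linear(w, b, xs, ys):
    """Gradient of mean-squared error for y = w*x + b over one shard."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def data_parallel_step(w, b, shards, lr):
    """One synchronous step: per-shard gradients, then mean all-reduce."""
    grads = [grad_mse_linear(w, b, xs, ys) for xs, ys in shards]
    dw = sum(g[0] for g in grads) / len(grads)  # average across workers
    db = sum(g[1] for g in grads) / len(grads)
    return w - lr * dw, b - lr * db

# Toy data for y = 2x, split across two "workers".
shards = [([1.0, 2.0], [2.0, 4.0]), ([3.0, 4.0], [6.0, 8.0])]
w, b = 0.0, 0.0
for _ in range(2000):
    w, b = data_parallel_step(w, b, shards, lr=0.02)
print(w, b)  # w converges toward 2, b toward 0
```

In a real distributed setting the averaging step is an MPI/NCCL all-reduce across machines rather than a Python loop, but the arithmetic is the same.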
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
cerndb/dist-keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
mratsim/weave
A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead
PaddlePaddle/PaddleFleetX
PaddlePaddle (飞桨) large-model development suite, providing a full-pipeline development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.
Oneflow-Inc/libai
LiBai (李白): A Toolbox for Large-Scale Distributed Parallel Training
alibaba/EasyParallelLibrary
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
dkeras-project/dkeras
Distributed Keras Engine: make Keras faster with only one line of code.
wenwei202/terngrad
Ternary Gradients to Reduce Communication in Distributed Deep Learning (TensorFlow)
vertexclique/orkhon
Orkhon: ML Inference Framework and Server Runtime
xrsrke/pipegoose
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
kuixu/keras_multi_gpu
Multi-GPU training for Keras
hkproj/pytorch-transformer-distributed
Distributed training (multi-node) of a Transformer model
NERSC/sc23-dl-tutorial
SC23 Deep Learning at Scale Tutorial Material
ryantd/veloce
WIP. Veloce is a low-code, Ray-based parallelization library for efficient, heterogeneous machine-learning computation.
daekeun-ml/sm-distributed-training-step-by-step
This repository provides hands-on labs for PyTorch-based distributed training and SageMaker distributed training. Written so beginners can easily get started, it guides you step by step through code modifications based on a basic BERT use case.
yangyang14641/Parallel-Matrix-Multiplication-FOX-Algorithm
:coffee: Implementation of parallel matrix multiplication using Fox's algorithm on Peking University's high-performance computing system
namhoonlee/effect-dps-public
Understanding the effects of data parallelism and sparsity on neural network training
dscpesu/NetTorrent
A decentralized and distributed framework for training DNNs
itzmeanjan/merklize-blake3
OpenCL powered Merklization using BLAKE3
Oblomov/cldpp
OpenCL Data Parallel Primitives
zbjob/DiscoPoP
Dependence-Based Code Transformation for Coarse-Grained Parallelism
explcre/pipeDejavu
pipeDejavu: Hardware-aware Latency Predictable, Differentiable Search for Faster Config and Convergence of Distributed ML Pipeline Parallelism
LER0ever/HPGO
Development of Project HPGO | Hybrid Parallelism Global Orchestration
plerros/helsing
A mostly POSIX-compliant utility that scans a given interval for vampire numbers.
sjlee25/batch-partitioning
Batch Partitioning for Multi-PE Inference with TVM (2020)
AnveshaM/Enhancing-performance-of-big-data-machine-learning-models-on-Google-Cloud-Platform
The project focuses on parallelising pre-processing, measurement, and machine learning in the cloud, as well as evaluating and analysing cloud performance.
ashayp22/monte-carlo-options-simd
SIMD multithreaded Monte Carlo options pricer in Rust 🦀
axr6077/Ray-Trace-Parallelization
A ray-tracing algorithm optimized through parallelization over different partitioning schemes, exploring the performance gains from grain size and processing-unit parameters relative to the sequential algorithm when rendering a high-resolution image.
HiEST/DistMIS
Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation
ngrabaskas/Torch-Automatic-Distributed-Neural-Network
Torch Automatic Distributed Neural Network (TorchAD-NN) training library. Built on top of TorchMPI, this module automatically parallelizes neural network training.
oekosheri/pytorch_unet_scaling
Scaling UNet in PyTorch
axr6077/Hogdkin-Huxley-Neuron-Model
Sequential and Parallel Implementation of the Hodgkin-Huxley Neuron model.
oekosheri/tensorflow_unet_scaling
Scaling UNet in TensorFlow
oriolaranda/DistMIS
Official Repository for the paper: Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation
Sujith013/Binary-Classification-using-Machine-Learning-and-Data-parallelism
Binary data classification using TensorFlow and Keras in Python, achieving data parallelism with MPI