distributed-deep-learning
There are 37 repositories under the distributed-deep-learning topic.
intel/BigDL
BigDL: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
dkeras-project/dkeras
Distributed Keras engine: make Keras faster with only one line of code.
dyadxmachina/Applied-Deep-Learning-with-TensorFlow
Learn applied deep learning from zero to deployment using TensorFlow 1.8+
zoranzhao/DeepThings
A Portable C Library for Distributed CNN Inference on IoT Edge Clusters
GuanhuaWang/sensAI
sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data
ParCIS/Chimera
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
vdutts7/dnn-distributed
Distributed training of DNNs • C++/MPI Proxies (GPT-2, GPT-3, CosmoFlow, DLRM)
rocketmlhq/rmldnn
RocketML Deep Neural Networks
intel/e2eAIOK
Intel® End-to-End AI Optimization Kit
rkhan055/SHADE
SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
ParCIS/Ok-Topk
Ok-Topk is a scheme for distributed training with sparse gradients. It integrates a novel sparse allreduce algorithm (with less than 6k communication volume, which is asymptotically optimal) with a decentralized parallel Stochastic Gradient Descent (SGD) optimizer, and its convergence is proven both theoretically and empirically.
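To make the idea behind this entry concrete, here is a minimal sketch of plain top-k gradient sparsification with a naive exchange step; it is not Ok-Topk's optimized sparse allreduce, and it assumes PyTorch with an already-initialized torch.distributed process group (the function name topk_allreduce is illustrative).

```python
# Minimal sketch of top-k gradient sparsification, NOT Ok-Topk's
# optimized sparse allreduce. Assumes an initialized process group.
import torch
import torch.distributed as dist

def topk_allreduce(grad: torch.Tensor, k: int) -> torch.Tensor:
    """Keep the k largest-magnitude gradient entries on each worker,
    exchange them, and average the resulting sparse contributions."""
    flat = grad.flatten()
    _, indices = torch.topk(flat.abs(), k)
    sparse_vals = flat[indices]

    world_size = dist.get_world_size()
    # Naive exchange: gather every worker's (indices, values) pair.
    # Ok-Topk replaces this with an allreduce whose communication
    # volume stays below ~6k entries per worker.
    all_idx = [torch.empty_like(indices) for _ in range(world_size)]
    all_val = [torch.empty_like(sparse_vals) for _ in range(world_size)]
    dist.all_gather(all_idx, indices)
    dist.all_gather(all_val, sparse_vals)

    reduced = torch.zeros_like(flat)
    for idx, val in zip(all_idx, all_val):
        reduced.index_add_(0, idx, val)
    return (reduced / world_size).view_as(grad)
```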
gsyang33/Driple
🚨 Prediction of the Resource Consumption of Distributed Deep Learning Systems
christianramsey/Tensorflow-for-Distributed-Deep-Learning
TensorFlow (1.8+) Datasets, Feature Columns, Estimators and Distributed Training using Google Cloud Machine Learning Engine
ravenprotocol/ravnest
Decentralized Asynchronous Training on Heterogeneous Devices
Shigangli/eager-SGD
Eager-SGD is a decentralized asynchronous SGD. It utilizes novel partial collective operations to accumulate the gradients across all processes.
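As a rough illustration of what "not waiting for slow workers" means, the sketch below approximates the idea with standard non-blocking allreduce: if the current collective has not completed, the local update reuses the most recent completed (possibly stale) result. This is an assumption-laden stand-in, not eager-SGD's actual partial collectives; the class name EagerAllreduce is hypothetical, and an initialized torch.distributed process group is assumed.

```python
# Conceptual sketch only: approximates "don't block on slow workers"
# with async allreduce plus a stale fallback. Not eager-SGD's
# partial-collective algorithm.
import torch
import torch.distributed as dist

class EagerAllreduce:
    def __init__(self):
        self.pending = None      # handle of the in-flight allreduce
        self.buffer = None       # tensor currently being reduced
        self.last_result = None  # last fully reduced gradient (may be stale)

    def submit(self, grad: torch.Tensor) -> None:
        """Start a non-blocking allreduce of this step's gradient."""
        self.buffer = grad.clone()
        self.pending = dist.all_reduce(self.buffer, async_op=True)

    def get(self) -> torch.Tensor:
        """Return a reduced gradient without blocking the local update."""
        if self.pending is not None and self.pending.is_completed():
            self.last_result = self.buffer / dist.get_world_size()
            self.pending = None
        return self.last_result if self.last_result is not None else self.buffer
```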
ray-project/anyscale-workshop-nyc-2023
Scalable NLP model fine-tuning and batch inference with Ray and Anyscale
Shigangli/WAGMA-SGD
WAGMA-SGD is a decentralized asynchronous SGD based on wait-avoiding group model averaging. Synchronization is relaxed by making the collectives externally triggerable, i.e., a collective can be initiated without requiring that all processes enter it. It partially reduces the data within non-overlapping groups of processes, improving parallel scalability.
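For context, here is a minimal sketch of the group-model-averaging part only: ranks are partitioned into non-overlapping groups and each group averages its model replicas synchronously. The wait-avoiding, externally-triggerable collectives that distinguish WAGMA-SGD are not reproduced; the helper names are illustrative and an initialized torch.distributed process group is assumed.

```python
# Plain synchronous group model averaging; the wait-avoiding collectives
# of WAGMA-SGD are not modeled here.
import torch
import torch.distributed as dist

def make_group(group_size: int):
    """Split all ranks into consecutive, non-overlapping groups and
    return the group this rank belongs to. Every process must create
    all groups in the same order."""
    world_size = dist.get_world_size()
    groups = [dist.new_group(list(range(start, start + group_size)))
              for start in range(0, world_size, group_size)]
    return groups[dist.get_rank() // group_size]

def group_average(model: torch.nn.Module, group) -> None:
    """Average model parameters across the ranks of one group."""
    size = dist.get_world_size(group=group)
    for p in model.parameters():
        dist.all_reduce(p.data, op=dist.ReduceOp.SUM, group=group)
        p.data /= size
```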
deepspark/deepspark_java
Java-based convolutional neural network package running on the Apache Spark framework
amirhosein-mesbah/Deep_Learning
This repository contains implementations of a wide variety of deep learning projects across computer vision, NLP, federated learning, and distributed learning, including both university projects and projects implemented out of interest in deep learning.
lancelee82/necklace
Distributed deep learning framework based on PyTorch/Numba/NCCL and ZeroMQ.
StefanoFioravanzo/distributed-deeplearning-kubernetes
Collection of resources for automatic deployment of distributed deep learning jobs on a Kubernetes cluster
AmrMKayid/KayDDRL
Distributed Deep Reinforcement Learning for Large Scale Robotic Simulations 👨💻🤖🕸🕹🕷❤️👨🔬
explcre/SHUKUN-Technology-AlgorithmIntern-MultiNodeTraining-for-DLmodels-Horovod-ConfigurationTutorial-Perf
SHUKUN Technology Co., Ltd. algorithm internship (2020/12–2021/5): multi-GPU, multi-node training for deep learning models with Horovod and the NVIDIA Clara Train SDK, plus a configuration tutorial and performance testing.
trilliwon/pytorch-examples
PyTorch Examples for Beginners
veritas9872/Horovod-Pytorch-Tutorial
Horovod tutorial for PyTorch using NVIDIA Docker.
hkvision/analytics-zoo
Distributed TensorFlow, Keras and BigDL on Apache Spark
mma735/TFM-DS
Comparison of distributed machine learning techniques applied to openly available datasets
siddhanthiyer-99/Distributed-Training-of-GANs
Implemented training strategies to alleviate bottlenecks and improve training speed while maintaining the quality of our GANs.
sqaz91819/Blockchain-NAS
A blockchain-based neural architecture search project.
thanoskaravangelis/distributed-deep-learning-ntua
Distributed Deep Learning experiments with the BigDL framework over Databricks
bilalsp/yelp-distributed-DL
Yelp review classification using a CNN model with Horovod on an HPC cluster
ch3njust1n/smpl
Simultaneous Multi-Party Learning Framework
hyunnnchoi/google-t5-fsdp-kubeflow
A foundational repository for setting up distributed training jobs using Kubeflow and PyTorch FSDP.
pierric/Mnist-Caffe-MPI
MNIST training using Caffe and Open MPI
smmehrab/distributed-deep-learning
Distributed Deep Learning
sotheanithsok/Image-Recognition-using-Distributed-ResNet-Model
An implementation of a distributed ResNet model for classifying CIFAR-10 and MNIST datasets.