pipeline-parallelism
There are 27 repositories under the pipeline-parallelism topic.
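As context for the repositories below: pipeline parallelism splits a model into sequential stages placed on different devices and feeds micro-batches through them in a staggered schedule, so stages work concurrently. A minimal sketch (a hypothetical illustration, not code from any listed repo) of the classic GPipe-style forward schedule:

```python
# Minimal illustration of pipeline parallelism (hypothetical sketch):
# a model split into stages processes micro-batches in a staggered
# schedule so the stages overlap their work.

def pipeline_schedule(num_stages, num_microbatches):
    """Return the forward-pass schedule as a list of time steps.

    Each time step is a list of (stage, microbatch) pairs that run
    concurrently -- the classic GPipe-style forward pipeline.
    """
    schedule = []
    # With S stages and M micro-batches, the pipeline drains after
    # S + M - 1 steps, instead of the S * M steps a serial run needs.
    for t in range(num_stages + num_microbatches - 1):
        step = [(s, t - s) for s in range(num_stages)
                if 0 <= t - s < num_microbatches]
        schedule.append(step)
    return schedule

sched = pipeline_schedule(num_stages=4, num_microbatches=8)
print(len(sched))   # 11 time steps = 4 + 8 - 1
print(sched[0])     # [(0, 0)] -- only stage 0 is busy at t=0
print(sched[3])     # all 4 stages busy once the pipeline is full
```

The "bubble" (idle stage slots at pipeline fill and drain) is what schedules like GPipe, 1F1B, and Chimera's bidirectional pipelines try to shrink.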
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
bigscience-workshop/petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
kakaobrain/torchgpipe
A GPipe implementation in PyTorch
PaddlePaddle/PaddleFleetX
PaddlePaddle's large-model development kit, providing a full-workflow development toolchain for large language models, cross-modal large models, bio-computing large models, and more.
Coobiw/MPP-LLaVA
Personal project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-style MLLM on RTX 3090/4090 24GB.
Oneflow-Inc/libai
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
InternLM/InternEvo
InternEvo is an open-source, lightweight training framework that aims to support model pre-training without extensive dependencies.
alibaba/EasyParallelLibrary
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
Shenggan/awesome-distributed-ml
A curated list of awesome projects and papers for distributed training or inference
torchpipe/torchpipe
Serving inside PyTorch
ai-decentralized/BloomBee
Decentralized LLMs fine-tuning and inference with offloading
xrsrke/pipegoose
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
AlibabaPAI/DAPPLE
An efficient pipelined data parallel approach for training large models
ParCIS/Chimera
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
saareliad/FTPipe
FTPipe and related pipeline model parallelism research.
gty111/gLLM
gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling
MLSysU/TD-Pipe
A Throughput-Optimized Pipeline Parallel Inference System for Large Language Models
nawnoes/pytorch-gpt-x
Implementation of autoregressive language model using improved Transformer and DeepSpeed pipeline parallelism.
fanpu/DynPartition
Official implementation of DynPartition: Automatic Optimal Pipeline Parallelism of Dynamic Neural Networks over Heterogeneous GPU Systems for Inference Tasks
garg-aayush/model-parallelism
Model parallelism for NN architectures with skip connections (eg. ResNets, UNets)
torchpipe/torchpipe.github.io
Docs for torchpipe: https://github.com/torchpipe/torchpipe
explcre/pipeDejavu
pipeDejavu: Hardware-aware Latency Predictable, Differentiable Search for Faster Config and Convergence of Distributed ML Pipeline Parallelism
LER0ever/HPGO
Development of Project HPGO | Hybrid Parallelism Global Orchestration
joe0731/hf_vram_calc
A CLI tool for estimating GPU VRAM requirements for Hugging Face models, supporting various data types, parallelization strategies, and fine-tuning scenarios like LoRA.
1set-t/ai-model
Industrial-grade weather visualization system that transforms AI model predictions into professional meteorological plots, emphasizing operational forecasting capabilities.
sparklerz/multigpu-llm-finetuning
This repository showcases hands-on projects leveraging distributed multi-GPU training to fine-tune large language models (LLMs).