AHEADer's Stars
Stability-AI/stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Hannibal046/Awesome-LLM
Awesome-LLM: a curated list of Large Language Models
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
ggerganov/ggml
Tensor library for machine learning
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
dair-ai/ML-Papers-of-the-Week
🔥Highlighting the top ML papers every week.
NVIDIA/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
NVIDIA/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
alpa-projects/alpa
Training and serving large-scale neural networks with auto parallelization.
PixArt-alpha/PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
modelscope/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
DefTruth/Awesome-LLM-Inference
📖A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
FranxYao/chain-of-thought-hub
Benchmarking large language models' complex reasoning ability with chain-of-thought prompting
linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
laekov/fastmoe
A fast MoE impl for PyTorch
intelligent-machine-learning/dlrover
DLRover: An Automatic Distributed Deep Learning System
patrick-kidger/jaxtyping
Type annotations and runtime checking for shape and dtype of JAX/NumPy/PyTorch/etc. arrays. https://docs.kidger.site/jaxtyping/
srush/LLM-Training-Puzzles
What would you do with 1000 H100s...
ZhuiyiTechnology/roformer
Rotary Transformer
BatsResearch/bonito
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
AmadeusChan/Awesome-LLM-System-Papers
maitrix-org/Pandora
Pandora: Towards General World Model with Natural Language Actions and Video States
NVIDIA/NeMo-Framework-Launcher
Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native.
ninehills/llm-inference-benchmark
LLM Inference benchmark
sail-sg/zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
Shenggan/awesome-distributed-ml
A curated list of awesome projects and papers for distributed training or inference
eedalong/ECE408
Code base and slides for ECE408: Applied Parallel Programming on GPU.
aschuh703/ECE408