Pinned Repositories
aibrix
Cost-efficient and pluggable infrastructure components for GenAI inference
flash-attention
Fast and memory-efficient exact attention
guidellm
Evaluate and enhance your LLM deployments for real-world inference needs
llm-compressor
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
production-stack
vLLM’s reference system for Kubernetes-native, cluster-wide deployment with community-driven performance optimization
recipes
Common recipes to run vLLM
semantic-router
Intelligent mixture-of-models router for efficient LLM inference
speculators
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs (see the usage sketch after this list)
vllm-ascend
Community-maintained hardware plugin for vLLM on Ascend
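For context on the flagship vllm engine pinned above, here is a minimal offline-inference sketch. It uses vLLM's documented Python entry points (LLM, SamplingParams, generate); the model name facebook/opt-125m is only an example choice, not a recommendation.

```python
from vllm import LLM, SamplingParams

# Load a small example model; vLLM handles continuous batching and
# paged KV-cache memory management internally.
llm = LLM(model="facebook/opt-125m")

# Sampling settings applied to every prompt in the batch.
sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# generate() takes a batch of prompts and returns one RequestOutput per prompt.
outputs = llm.generate(["The capital of France is", "vLLM is"], sampling)
for out in outputs:
    print(f"{out.prompt!r} -> {out.outputs[0].text!r}")
```

The same engine also powers an OpenAI-compatible HTTP server (started with the vllm serve command), which is the deployment mode the production-stack and guidellm projects below target.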
vLLM's Repositories
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
vllm-project/aibrix
Cost-efficient and pluggable infrastructure components for GenAI inference
vllm-project/llm-compressor
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM (see the quantization sketch after this list)
vllm-project/production-stack
vLLM’s reference system for Kubernetes-native, cluster-wide deployment with community-driven performance optimization
vllm-project/semantic-router
Intelligent mixture-of-models router for efficient LLM inference
vllm-project/vllm-ascend
Community-maintained hardware plugin for vLLM on Ascend
vllm-project/guidellm
Evaluate and enhance your LLM deployments for real-world inference needs
vllm-project/recipes
Common recipes to run vLLM
vllm-project/flash-attention
Fast and memory-efficient exact attention
vllm-project/speculators
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
vllm-project/dashboard
vLLM performance dashboard
vllm-project/vllm-spyre
Community-maintained hardware plugin for vLLM on Spyre
vllm-project/vllm-openvino
vllm-project/ci-infra
Hosts the code for vLLM's CI and performance benchmark infrastructure.
vllm-project/vllm-project.github.io
vllm-project/vllm-nccl
Manages the vllm-nccl dependency
vllm-project/vllm-gaudi
Community-maintained hardware plugin for vLLM on Intel Gaudi
vllm-project/vllm-project.github.io-static
vllm-project/vllm-xpu-kernels
vLLM XPU kernels for Intel GPUs
vllm-project/FlashMLA
vllm-project/media-kit
vLLM Logo Assets
vllm-project/rfcs
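To illustrate the llm-compressor entry referenced above, a one-shot post-training quantization sketch. It follows the GPTQ W4A16 flow from the project's documentation; the model name, calibration dataset, output directory, and calibration settings are example values chosen for this sketch, and the API is assumed to match the current llmcompressor package.

```python
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

# One-shot quantization: calibrate on a small dataset, then write a
# W4A16 checkpoint that vLLM can load directly.
oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # example model
    dataset="open_platypus",                     # example calibration set
    recipe=GPTQModifier(scheme="W4A16", targets="Linear", ignore=["lm_head"]),
    output_dir="TinyLlama-1.1B-W4A16",
    max_seq_length=2048,
    num_calibration_samples=512,
)
```

The resulting checkpoint directory can then be passed straight to vLLM, e.g. LLM(model="TinyLlama-1.1B-W4A16"), which is the integration the repo description refers to.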