kvcache.ai
KVCache.AI is a joint research project between MADSys and top industry collaborators, focusing on efficient LLM serving.
Pinned Repositories
custom_flashinfer
FlashInfer: Kernel Library for LLM Serving
ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
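For orientation, a minimal sketch of driving an inference engine such as vLLM offline from Python; the model name, prompt, and sampling settings below are illustrative placeholders, not part of this page:

```python
# Minimal offline-inference sketch using vLLM's Python API.
# Model name and sampling settings are illustrative placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")   # any HF-compatible model
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["Explain what a KV cache is in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)  # generated continuation for each prompt
```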