Pinned Repositories
flashinfer
FlashInfer: Kernel Library for LLM Serving
lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
DistServe
Disaggregated serving system for Large Language Models (LLMs).
openvino
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
sglang
SGLang is a fast serving framework for large language models and vision language models.
vllm-prefix-caching
A high-throughput and memory-efficient inference and serving engine for LLMs
Wine-Quality-Analysis
Analyzing and classifying the characteristics of high-quality red wines using machine learning
ScaleLLM
A high-performance inference system for large language models, designed for production environments.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
inference-server