vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python · Apache-2.0
Stargazers
- 66RING (Chaotic Futurism; @SJTU-IPADS)
- aarnphm (@bentoml)
- abacaj (software eng building things)
- BabyChouSr (chillin)
- beamiter
- BrightXiaoHan (Ifun Game)
- Btlmd (Tsinghua University)
- caoshiyi (UC Berkeley)
- concretevitamin (UC Berkeley)
- fengyangyang98 (Tsinghua University)
- gaocegege (@TensorChord)
- hanzz2007 (China)
- jaywonchung (University of Michigan)
- lambda7xx (Shanghai Jiao Tong University)
- larme (@bentoml)
- Matthieu-Tinycoaching
- Michaelvll (Sky Computing Lab, UC Berkeley)
- michaelzhiluo (UC Berkeley)
- MohamedAzyzChayeb
- nikitavoloboev (Madrid)
- pengwu22 (ByteDance)
- PKUFlyingPig (Peking University)
- prnake (Tsinghua University)
- romilbhardwaj (UC Berkeley)
- ryantd (@kwai)
- VoVAllen (@Tensorchord)
- wildkid1024 (Capital Normal University)
- wolegechu (StepFUN)
- WoosukKwon (University of California, Berkeley)
- wxj77
- Xiao9905 (Tsinghua University)
- zbruceli
- zhisbug
- ZiruiOu
- ZQ-Dev8 (California)
- ZYHowell (Carnegie Mellon University)