vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python · Apache-2.0
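
For context on what the engine does, below is a minimal sketch of offline batch inference with vLLM's Python API. It assumes vLLM is installed (`pip install vllm`) and a GPU is available; the model name and prompts are illustrative placeholders, not anything prescribed by this listing.

```python
# Minimal offline-inference sketch using vLLM's Python API.
# The model name and prompts are illustrative assumptions.
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The capital of France is",
]

# Sampling settings: temperature plus nucleus-sampling cutoff.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Load the model once; vLLM manages KV-cache memory internally
# so that many requests can be batched with high throughput.
llm = LLM(model="facebook/opt-125m")

# Generate completions for all prompts in one batched call.
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```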
Stargazers
- xpl (Austin, Texas, United States)
- infwinston
- sarahwooders (Berkeley, CA)
- ignoramous
- psrikhanta
- RFJBraunstingl
- uhthomas (London)
- sandeep06011991
- shadeMe (In the Great Hall and in the bladder)
- BecomeAllan (Brazil)
- NonMundaneDev (Nigeria)
- danskycode
- ZhengyaoJiang (London, UK)
- breakds (San Francisco Bay Area)
- RezaYazdaniAminabadi (Vancouver, Canada)
- senko (Zagreb, Croatia)
- iseanstevens (SF)
- chhzh123 (Ithaca, NY)
- jinhongyii (Pittsburgh)
- salomartin (London)
- Sandalots
- kaiokendev
- denisfitz57
- Xiuyu-Li
- brandon-lockaby (Greenville, SC)
- DanielNill
- zygmuntz
- MWARDUNI (Denver, CO)
- ianmobbs (San Francisco Bay Area)
- whyboris (Michigan)
- tobi (Ottawa, Canada)
- Vivekhaz (New York City)
- tokestermw (San Francisco, CA)
- TejaGollapudi (Bay Area)
- null-dev (Canada)
- malcolmgreaves (NYC)