neuralmagic/nm-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python, license: NOASSERTION

20 Issues
251 Stargazers
8 Watchers

Watchers

bradjonesca (New York)
ChrisMii
eemailme
ghchris2021
mgoin (@neuralmagic)
Qubitium (ModelCloud.ai)
tlrmchlsmth (@neuralmagic)
trappedinspacetime (For Personal Use)
