EmbeddedLLM/vllm

vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs

Python · Apache-2.0

13 Issues
89 Stargazers
2 Watchers

Watchers

sarutobiumon
tanpinsiang
