/worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

Primary language: Python. License: MIT.
