vllm-project/vllm

Improve CUDA compatibility of vllm-openai image

RonanKMcGovern opened this issue · 2 comments

Currently, the vllm-openai image (https://hub.docker.com/r/vllm/vllm-openai/) uses CUDA 12.1. Depending on the drivers installed on the underlying GPU host, this leads to a lot of CUDA version-mismatch issues.

This makes the image an unreliable starting point for running on services like Vast.ai or RunPod, where the host driver version varies from machine to machine.
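To illustrate the kind of mismatch described above, here is a minimal sketch of a pre-flight check. It assumes the host has `nvidia-smi` installed and that its header contains the usual `CUDA Version: X.Y` field (the highest CUDA version the driver supports). The function names and the simple "driver version must be at least the image version" rule are my own assumptions for illustration, not anything vLLM provides; the rule is deliberately conservative and ignores CUDA minor-version compatibility and the forward-compatibility packages.

```python
import re
import subprocess

# CUDA version the vllm-openai image is built against, as stated in this issue.
IMAGE_CUDA = (12, 1)


def parse_driver_cuda(nvidia_smi_output):
    """Extract (major, minor) from the 'CUDA Version: X.Y' field, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def host_can_run(image_cuda, driver_cuda):
    """Conservative rule (an assumption): the driver must support at least the image's CUDA version."""
    return driver_cuda is not None and driver_cuda >= image_cuda


def driver_cuda_from_host():
    """Run nvidia-smi on the current host and parse its header (hypothetical helper)."""
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True, check=True)
    return parse_driver_cuda(result.stdout)
```

On a rented machine, `host_can_run(IMAGE_CUDA, driver_cuda_from_host())` returning `False` would suggest the host driver is too old for the current image.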

Could the Docker image be updated to support a wider range of CUDA versions, from 11.8 upward?

For comparison, the text-generation-inference Docker image doesn't have these issues. See here

I would really like an official image that supports CUDA 11.8.

Or please 🙏 provide a guide on how to build a CUDA 11.8 version of the vllm-openai image.