entelecheia/openllm-container

Support vLLM

Closed this issue · 0 comments

Pull the NVIDIA PyTorch Docker image, which ships with CUDA 11.8.

Pass --ipc=host so the container's shared memory is large enough; vLLM relies on shared memory for tensor-parallel inference.
docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3

https://vllm.readthedocs.io/en/latest/getting_started/installation.html
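Inside the container, vLLM can be installed with `pip install vllm` and smoke-tested with its offline inference API. A minimal sketch following the linked installation guide (the model name is illustrative, and a CUDA-capable GPU is required at runtime):

```python
# Minimal vLLM offline-inference sketch; assumes `pip install vllm`
# succeeded inside the container and a GPU is visible (--gpus all).
from vllm import LLM, SamplingParams

# Small model chosen only to keep the smoke test light; swap in the
# model this container is meant to serve.
llm = LLM(model="facebook/opt-125m")

# Sampling settings for a short generation.
params = SamplingParams(temperature=0.8, max_tokens=32)

# Generate a completion for a single prompt and print it.
outputs = llm.generate(["Hello, my name is"], params)
for out in outputs:
    print(out.outputs[0].text)
```

If this runs without a CUDA or shared-memory error, the container setup above (CUDA 11.8 image plus `--ipc=host`) is working.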