Support vLLM
entelecheia commented
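Use the NVIDIA PyTorch Docker image built with CUDA 11.8 (docker run pulls it if it is not already present locally).
Pass --ipc=host so the container shares the host's IPC namespace and has enough shared memory.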
docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3
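Once inside the container, it may help to confirm that the shared-memory mount really is large. The following is only a minimal sketch using the Python standard library; the helper name shared_memory_gib is made up for illustration, and it assumes shared memory is mounted at /dev/shm, as is usual on Linux. Without --ipc=host or --shm-size, Docker's default /dev/shm is 64 MB, so a value far above that suggests the flag took effect.

```python
import os
import shutil


def shared_memory_gib(path="/dev/shm"):
    """Return the total size of the mount at `path` in GiB, or None if it is absent.

    `shared_memory_gib` is a hypothetical helper, not part of vLLM or Docker.
    """
    if not os.path.isdir(path):
        return None
    # disk_usage reports bytes; 2**30 bytes = 1 GiB
    return shutil.disk_usage(path).total / 2**30


if __name__ == "__main__":
    size = shared_memory_gib()
    if size is None:
        print("/dev/shm not found")
    else:
        print(f"/dev/shm total: {size:.2f} GiB")
```

If the reported size is still around 0.06 GiB, the container was probably started without --ipc=host.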
Then install vLLM inside the container by following the installation guide:
https://vllm.readthedocs.io/en/latest/getting_started/installation.html