CUDA out of memory
Raeps opened this issue · 1 comments
Raeps commented
sherdencooper commented
Hi, thanks for trying our code. The default vLLM setting uses 95% of GPU memory when creating the model (I changed it to 98% for efficiency). Once the model is created, it should not use any more GPU memory, so I suspect the error you got was raised while vLLM was creating the model. You can pass gpu_memory_utilization=0.8 (or another value suited to your GPU) when creating the LocalVLLM class.
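As a rough sketch of the suggestion above: vLLM's `LLM` constructor accepts a `gpu_memory_utilization` argument that caps the fraction of GPU memory the engine reserves at startup. The model name below is illustrative, and the exact place this argument is forwarded inside the repo's `LocalVLLM` wrapper is an assumption about that class.

```python
# Sketch: lower the fraction of GPU memory vLLM reserves when building the engine.
# The model name is illustrative; adjust gpu_memory_utilization to your GPU usage.
from vllm import LLM

llm = LLM(
    model="meta-llama/Llama-2-7b-chat-hf",  # example model, not prescribed by the repo
    gpu_memory_utilization=0.8,             # reserve 80% instead of the higher default
)
```

If other processes already occupy part of the GPU, the reserved fraction applies to total device memory, so you may need to lower it further than 0.8.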