With TASKS=llm,rag, a multiprocessing error is raised: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Opened this issue · 3 comments
syusama commented
The following items must be checked before submission
- Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
- I have read the project documentation and the FAQ section, and I have searched the existing issues/discussions without finding a similar problem or solution.
Type of problem
Model inference and deployment
Operating system
Linux
Detailed description of the problem
OS: Ubuntu
Deployment: docker-compose
Image: api-llm:vllm
When deploying the llm and embedding models together with
TASKS=llm,rag
the following error is raised:
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Deploying llm alone works fine.
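This error generally means the parent process initialized CUDA before forking worker processes: on Linux the default multiprocessing start method is `fork`, and a forked child inherits an already-initialized CUDA context that PyTorch refuses to reuse. A minimal, generic workaround (not specific to this image; newer vLLM versions may also expose this via the `VLLM_WORKER_MULTIPROC_METHOD=spawn` environment variable, depending on version) is to force the `spawn` start method before any CUDA work:

```python
import multiprocessing as mp

def configure_spawn():
    """Force the 'spawn' start method before any CUDA initialization.

    With 'fork', a child inherits the parent's CUDA context, and PyTorch
    raises 'Cannot re-initialize CUDA in forked subprocess'. 'spawn'
    starts each worker as a fresh interpreter, so CUDA initializes
    cleanly in the child.
    """
    mp.set_start_method("spawn", force=True)
    return mp.get_start_method()
```

This must run before any `torch.cuda` call or model load in the parent process; setting it after CUDA has been touched in a forked child does not help.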
运行日志或截图 | Runtime logs or screenshots
FreeRotate commented
Hello, has this issue been resolved? I'm running into the same error.
syusama commented
> Hello, has this issue been resolved? I'm running into the same error.

No, I never solved it; I ended up starting one llm instance and one rag instance separately.
FreeRotate commented
> No, I never solved it; I ended up starting one llm instance and one rag instance separately.

I just solved it by rolling back the vllm and torch versions: use vllm==0.4.2 and torch==2.3.0.
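For anyone applying the same rollback, the pins reported above can be installed directly (a sketch; verify compatibility with your base image and CUDA version first):

```shell
# Pin the version pair reported as working together in this thread.
pip install "vllm==0.4.2" "torch==2.3.0"
```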