Running Qwen2-72B inference with vLLM on 4x A100 fails with: (VllmWorkerProcess pid=10898) ERROR 09-04 17:23:44 multiproc_worker_utils.py:226] RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
TuDaCheng opened this issue · 0 comments
TuDaCheng commented
Search before asking
- I had searched in the issues and found no similar issues.
Operating system information
Linux
Python version information
3.10
DB-GPT version
latest release
Related scenes
- Chat Data
- Chat Excel
- Chat DB
- Chat Knowledge
- Model Management
- Dashboard
- Plugins
Installation Information
- AutoDL Image
- Other
Device information
4x A100, 160 GB
Models information
text2vec-large-chinese
What happened
Running Qwen2-72B inference with vLLM on 4x A100 fails with: (VllmWorkerProcess pid=10898) ERROR 09-04 17:23:44 multiproc_worker_utils.py:226] RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
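For reference, the error message itself asks for the 'spawn' start method. Below is a minimal sketch of what that could look like, not a confirmed fix for this setup. It assumes the installed vLLM version reads the `VLLM_WORKER_MULTIPROC_METHOD` environment variable, and that the variable is set before vLLM (or anything else that initializes CUDA) is imported. The helper name `configure_spawn` is hypothetical and only used for illustration.

```python
import multiprocessing
import os


def configure_spawn():
    """Request 'spawn' for worker processes and return a spawn context.

    Assumption: the installed vLLM version honors the
    VLLM_WORKER_MULTIPROC_METHOD environment variable. It has to be set
    before vLLM creates its worker processes, so call this early.
    """
    os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"
    # A spawn context starts children as fresh interpreters, so they do not
    # inherit CUDA state already initialized in the parent process.
    return multiprocessing.get_context("spawn")


if __name__ == "__main__":
    ctx = configure_spawn()
    print(ctx.get_start_method())
```

Exporting the same variable in the shell before launching the model worker should have the same effect, under the same assumption.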
What you expected to happen
Multi-GPU inference runs into a problem.
How to reproduce
1
Additional context
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!