Issues
- Can vllm be installed on a Jetson device? (#10, opened by KungFuPandaPro, 0 comments)
- The inference speedup from vllm is not noticeable; how can this be resolved? (#7, opened by zzyzeyuan, 3 comments)
- Error when running vllm_offline.py (#8, opened by zjjznw123, 3 comments)
- How can batched inference be done with streaming output? (#9, opened by Simple6K, 0 comments)
- Which version of vllm does this repository use? (#5, opened by zzyzeyuan, 0 comments)
- Will passing multiple prompts to model.generate() at once be supported later? (#6, opened by zzyzeyuan, 0 comments; see the batch-generation sketch after this list)
- Are Qwen1.5-series models supported (different quantization methods / non-quantized)? (#2, opened by tomFoxxxx, 6 comments)
- How can LoRA loading be implemented? (#3, opened by nlp-learner, 1 comment; see the LoRA sketch after this list)
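
Issue #6 asks whether several prompts can be sent to model.generate() together. For reference, upstream vllm's own `LLM.generate()` already accepts a list of prompts and batches them inside the engine. A minimal sketch, assuming the upstream vllm package rather than this repository's wrapper; the model name and sampling settings below are placeholders:

```python
from vllm import LLM, SamplingParams

# Placeholder model; substitute the checkpoint this repository targets.
llm = LLM(model="Qwen/Qwen-7B-Chat", trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.7, max_tokens=256)

# generate() accepts a list of prompts and runs them as one batch,
# relying on the engine's continuous batching.
prompts = [
    "你好, 请介绍一下自己。",
    "What is the capital of France?",
]
outputs = llm.generate(prompts, sampling_params)
for out in outputs:
    print(out.prompt, "->", out.outputs[0].text)
```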
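
Issue #3 asks how LoRA loading can be implemented. Upstream vllm exposes documented LoRA support via `enable_lora=True` and `LoRARequest`; the sketch below shows that path and may not match this repository's own code. The adapter name and path are placeholders:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# enable_lora lets the engine apply an adapter per request.
llm = LLM(model="Qwen/Qwen-7B-Chat", enable_lora=True, trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.7, max_tokens=256)

# LoRARequest(adapter name, unique integer id, local adapter path);
# "my-adapter" and the path are hypothetical placeholders.
outputs = llm.generate(
    ["你好, 请介绍一下自己。"],
    sampling_params,
    lora_request=LoRARequest("my-adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```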