Issues
int8 quantization produces incomplete output
#19 opened by zhenglinpan - 1
How to specify load_model_path when the model is sharded?
#18 opened by caowenhero - 1
Hello, is llama 65B supported yet?
#17 opened by zcuuu - 4
Multi-GPU inference
#16 opened by yingzhao27 - 1
When will LoRA model inference be available?
#15 opened by isaacxie41 - 3
fp32 precision inference
#10 opened by biubiobiu - 1
Why does the LLaMa model have only an encoder and no decoder?
#14 opened by yyqi17 - 4
Generated output is garbled
#13 opened by McCarrtney - 2
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
#11 opened by yingzhao27 - 1
Code errors out when running multi-turn dialogue
#12 opened by LJL00000 - 3
Does llama_server.py support multi-GPU inference?
#7 opened by yuxuan2015 - 1
Please add LoRA support
#6 opened by ze00ro - 1