Issues
Requesting a new QR code for the TLLM discussion group
#113 opened by Missmiaom - 1
How should a beginner properly study the project code?
#124 opened by Huziyou - 2
run.py error
#123 opened by caoquanjie - 5
Does the current Qwen-VL implementation only support a single input image, which must appear at the beginning of the input?
#90 opened by xikaluo - 1
Build error with tp_size=4
#118 opened by mogoxx - 0
What does pp_size mean?
#122 opened by UIHCRITT - 1
qwen1.5 build error: KeyError: 'kv_cache_block_pointers_list'
#107 opened by whk6688 - 3
Bug when running the qwen build script
#117 opened by zixuxu000 - 2
Qwen1.5-32B-Chat-GPTQ-Int4 build fails
#119 opened by panjican - 4
Qwen-32b prompt cache not supported
#116 opened by wangye360 - 1
Requesting help
#114 opened by Bilibili-Mikoto - 1
When testing performance with perf_analyzer, I get the error "Thread [0] had error: Cannot send stop request without specifying a request_id"
#112 opened by MuyeMikeZhang - 1
Is qwen-vl fine-tuned with swift supported?
#75 opened by xs818818 - 27
How to build Qwen-72B-Chat-Int4 with tp=2
#94 opened by liyunhan - 11
What are the differences between qwen1.5_7b and the llama family?
#108 opened by DBCGary - 6
Triton and Langchain deployment issues
#95 opened by plt12138 - 17
How to use multiple GPUs in qwen2/quantize.py?
#105 opened by qy1026 - 2
A question about the codeqwen7b model build process
#111 opened by shiqingzhangCSU - 3
Dimension mismatch when applying smoothquant to Qwen1.5
#110 opened by zgplvyou - 9
python run error
#104 opened by maozixi1 - 2
An error about kv_cache_block_pointers_list
#109 opened by lll143653 - 2
Has anyone compared inference results against vLLM?
#72 opened by white-wolf-tech - 4
API for multi-GPU inference
#106 opened by UIHCRITT - 18
Triton deployment generates garbled output
#101 opened by maozixi1 - 2
web_demo cannot display model responses
#103 opened by elegant-bot - 3
qwen_14b_chat build error
#100 opened by AlgoJay1991 - 4
Failed to build the tritonserver image
#96 opened by maozixi1 - 23
OOM when testing HF throughput, plus Triton concurrency and streaming output issues
#81 opened by dongteng - 2
Qwen-72B-Chat-Int4 killed
#82 opened by Hukongtao - 15
Question about Triton synchronous/asynchronous interfaces
#91 opened by dongteng - 8
Running run.py fails with Segmentation fault (core dumped)
#93 opened by ArlanCooper - 2
Running the build script fails: TypeError: RowLinear.__init__() got an unexpected keyword argument 'instance_id'
#86 opened by ArlanCooper - 2
How to support normal batch inference?
#88 opened by zhangyu68 - 6
Why doesn't GPU memory usage decrease after smoothquant quantization?
#87 opened by tp-nan - 5
Building the official qwen_1_8B-Chat-int4 with auto-gptq fails: KeyError: 'transformer.h.0.attn.c_attn.qweight'
#83 opened by fmozer - 2
Why is the 72B model experimental? The architecture should be the same, so what's the reason? Thanks
#84 opened by zhangjiekui - 3
ERROR: Failed to create instance: unexpected error when creating modelInstanceState
#71 opened by lyc728 - 2
Qwen1.5 GPTQ doesn't work
#76 opened by Pevernow - 15
Qwen1.5 GPTQ-Int4 build fails
#77 opened by ljhssga - 1
Qwen1.5 GPTQ build error
#78 opened by compass-star - 5
Qwen2 build error
#80 opened by mogoxx - 2