Issues
Why does converting the original model with qwen.cpp make the file larger?
#79 opened by zzzcccxx - 2
qwen2 support
#82 opened by bil-ash - 0
How to cross-compile from x86 to ARM64?
#84 opened by yux-lab - 5
After LoRA fine-tuning a Qwen1.5 model, how do I load the fine-tuned parameters and convert the fine-tuned model?
#83 opened by Gooddz1 - 0
[BUG] How should the prompt for multi-turn dialogue be constructed?
#81 opened by 791136190 - 3
pip install -U qwen-cpp fails with an error
#58 opened by micronetboy - 2
qwen1.5 support?
#80 opened by anan1213095357 - 2
How to connect a Gradio front end to the qwen-cpp inference code?
#66 opened by tougeqaq - 8
Does it support Qwen1.5 Model?
#78 opened by kicGit - 0
After building the Python binding, how can inference be run on CPU only?
#77 opened by zzzcccxx - 7
Support `--gpu-layers`
#45 opened by lindeer - 3
Python binding error
#33 opened by xinbingzhe - 2
Python binding fails to install
#43 opened by passionate11 - 1
Python binding error: ERROR: Could not build wheels for qwen-cpp, which is required to install pyproject.toml-based projects
#52 opened by zhangzai666 - 1
Can qwen_cpp provide an API to serve a web service?
#53 opened by zhangzai666 - 2
Qwen-7B-Chat WSL GPU Error: ankerl::unordered_dense::map::at(): key not found
#29 opened by dlutsniper - 0
Why is "assistant" missing here?
#76 opened by feixyz10 - 0
How to download tiktoken_cpp
#74 opened by eswulei - 0
Add reporting of token generation speed
#73 opened by OliverQueen1466 - 0
How can a model quantized with qwen.cpp be benchmarked with optimum-benchmark? Following the README only produces a build folder, and it is unclear how to proceed with testing.
#72 opened by suyu-zhang - 0
Why does `TextStreamer` hold on punctuation?
#71 opened by Wovchena - 0
Problems using qwen.cpp on Windows
#70 opened by kingpingyue - 3
Hoping the team continues to support qwen.cpp
#60 opened by awtestergit - 0
Multi-turn conversation
#67 opened by litongjava - 4
💡 [REQUEST] - How can CPU-only qwen-cpp be wrapped as an HTTP service?
#65 opened by micronetboy - 2
💡 [Question] - With qwen-cpp on CPU only versus with CPU BLAS acceleration enabled (no GPU in either case), how large is the speed difference? In my tests there was none.
#63 opened by micronetboy - 0
💡 [Question] - Hello, a question: how can a qwen-cpp BaseStreamer be constructed from a std::string? The C++ code is missing such a constructor.
#62 opened by micronetboy - 0
Hello, a question: how can a qwen-cpp BaseStreamer be constructed from a std::string? The C++ code is missing such a constructor.
#61 opened by micronetboy - 1
Why is there a large performance gap for qwen.cpp between A100 and A10?
#56 opened by zhangzai666 - 0
How can the Python binding enable BLAS CPU acceleration?
#59 opened by micronetboy - 1
Python binding fails to compile on Windows
#37 opened by AppleJunJiang - 0
CUDA error 2 at /home/qwen.cpp/third_party/ggml/src/ggml-cuda.cu:7196: out of memory
#55 opened by youngallien - 9
How much memory does quantizing the 72B model need? Even with 192 GB the process gets killed
#47 opened by sweetcard - 0
How much memory does quantizing the 7B model need? I keep getting out of memory
#51 opened by WCSY-YG - 0
Question about ctx_w_size in the code
#49 opened by EveningLin - 5
Support for AMD's ROCm
#46 opened by riverzhou - 6
GGML_ASSERT when using a long prompt
#44 opened by Ayahuasec - 1
Qwen-7B-Q4_0 works well on Mac M1, but Qwen-7B-Q8_0 fails with a ggml-metal error.
#42 opened by songkq - 1
pip install of qwen_cpp fails on 64-bit Linux — unsupported?
#32 opened by qianliyx - 0
Inference results of qwen.cpp for Qwen-14b-chat differ from Qwen-14b-chat running on CUDA
#30 opened by wertyac - 0
Can you add an additional function to let convert.py support Qwen/Qwen-7B-Chat-Int4?
#28 opened by x1ngzai