Issues
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xce in position 4411: invalid continuation byte
#489 opened by huqiangDu - 0
Error when converting the Llama-2-7b model
#501 opened by JocelynPanPan - 0
Main.exe compiled with w64devkit v.1.21.0 silently terminates after launching.
#499 opened by JohnClaw - 0
Thanks for the great LLM inference engine. Could you make a DLL version of it, please?
#498 opened by JohnClaw - 0
How to configure parameters to maximize the number of concurrent requests the service can handle
#494 opened by xiaoshizijiayou - 0
error: no suitable user-defined conversion from "__half" to "__nv_bfloat16" exists
#493 opened by xiaoshizijiayou - 4
ModuleNotFoundError: No module named 'ftllm'
#492 opened by mingyue0094 - 28
How to load an adapter directly via parameters?
#491 opened by xiaoshizijiayou - 1
When accelerating the llama3-sqlcoder-8b model (Finetuned from model: [Meta-Llama-3-8B-Instruct]), the output is wrong: it is all "!!!!!"
#487 opened by Juvember - 1
Converting a PyTorch model to an flm model gets Killed
#488 opened by scutzhe - 3
Returned results are always <unk>
#452 opened by VincentLore - 1
After weight conversion, the model's answers are inconsistent with the original model
#486 opened by Whylickspittle - 1
Error when running the model after compilation
#484 opened by supercj92 - 0
chatglm loses its function calling ability
#485 opened by NingRiCheng - 1
How to install fastllm on the domestic accelerators Ascend 910 and Hygon DCU?
#482 opened by cgq0816 - 0
When will deployment code for GLM4-V-9B be released?
#481 opened by GalSang17 - 1
How to deploy across multiple GPUs
#480 opened by longcheng183 - 1
OSError: libcublas.so.11: cannot open shared object file: No such file or directory
#471 opened by lichengyang666 - 5
Meta-Llama-3-70B-Instruct
#470 opened by longcheng183 - 3
Error during make -j
#459 opened by AIlaowong - 4
When will GLM-4 be supported?
#462 opened by Stupid-Ai - 5
GLM-4-6B-Chat cannot be loaded after conversion to flm format
#465 opened by HofNature - 1
Is deepseekv2 quantization supported yet?
#457 opened by fw2325 - 1
Compilation error on half type conversion when building in Docker on H800
#463 opened by ShadowTeamCN - 0
qwen1.5 int4 model replies hit a decoding problem: UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 72-73: invalid continuation byte
#461 opened by zhang415 - 0
chatglm3 generates identical results for the same prompt
#450 opened by ttaop - 0
Do you have a plan to implement the CudaCatOp?
#448 opened by dp-aixball - 1
Chinese input is not recognized; the address served by the webui cannot be accessed.
#447 opened by Mihubaba - 2
Qwen qwen1.5-14B-chat decoding error
#446 opened by yiguanxian - 2
Error running cmake -j
#445 opened by gggdroa - 1
Cannot install fastllm_pytools
#443 opened by bailingchun - 0
Is it true that already-quantized models can't be used for model conversion?
#437 opened by shum-elli - 0
Is qwen1.5's sliding-window approach supported?
#436 opened by aofengdaxia - 0
Hello, how does the performance compare with chatglm.cpp? Is it better?
#435 opened by ericjing83 - 3
Does fastllm support a chatglm3-6b-base int4 model quantized with bitsandbytes?
#434 opened by levinxo - 0
ResponseBatch returns incorrect results
#429 opened by Liufeiran123 - 0
Request: support Grouped Query Attention
#416 opened by TylunasLi - 0
Code related to batch padding mask handling
#427 opened by Liufeiran123 - 1
qwen outputs wrong results
#418 opened by Liufeiran123 - 0
Will multi-turn conversation for ChatGLM3 be supported later?
#419 opened by chenyangjun45 - 1
When converting the model format (.bin -> .flm)
#413 opened by ColorfulDick - 2
Why does GPU utilization only reach 60%?
#414 opened by Chenhuaqi6 - 2
Error when the output is especially long.
#409 opened by aofengdaxia