Issues
Judging from how the chat interface internally calls the generate interface, the input_ids produced by the concatenation described above do not match your id definitions for special tokens (e.g. <|user|>, <|assistant|>). Is this only for compatibility with the generic generate interface, and does it degrade model performance?
#1256 opened by Tesla-jiang - 1
LoRA fine-tuning reports an error
#1242 opened by ZhuXuesong7423 - 1
Running web_demo_gradio.py under basic_demo raises ModuleNotFoundError: No module named 'peft'
#1253 opened by miracles-zhang - 1
openai_api_request.py fails to run
#1258 opened by SENVENHUHU - 1
Is tool calling not supported in API mode?
#1249 opened by 4ooooo - 0
Is the example in langchain_demo not streaming?
#1254 opened by ciaoyizhen - 0
Does the concatenation format conflict with the chat interface's processing logic?
#1238 opened by Tesla-jiang - 2
No pytorch_model.bin after LoRA fine-tuning
#1203 opened by zhengshi119 - 4
The following error is reported during P-Tuning v2 fine-tuning
#1237 opened by cskaoyan - 1
LoRA fine-tuning reports an error
#1239 opened by xiaohaiqing - 2
composite_demo code interpreter cannot generate images
#1187 opened by Aike505 - 5
Loading chatglm3 with quantization fails: round_vml_cpu not implemented for Half
#1217 opened by imempty - 0
Different implementations of RMSNorm
#1240 opened by trundleyrg - 3
After starting /openai_api_demo/api_server.py, streaming requests to /v1/chat/completions sometimes return data and sometimes do not
#1177 opened by caijx168 - 1
Issue with requirements.txt
#1231 opened by zhfish - 3
problems when finetuning with lora
#1232 opened by RemiAlliah - 0
[Help] Question about algorithm filing (regulatory registration)
#1233 opened by bh4ffu - 3
Exception at runtime: CUDA error: device-side assert triggered. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions
#1228 opened by hotcolaava - 2
Can api_server.py only be called via POST requests?
#1204 opened by Yapeng-Gao - 1
Question about the loss_mask format for multi-turn dialogue fine-tuning
#1213 opened by RyanOvO - 3
After P-Tuning v2 fine-tuning, running inference with inference_hf.py shows: Both `max_new_tokens` (=512) and `max_length` (=8192) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)
#1215 opened by 52566rz - 8
Function calling in all of the official openai_api implementations has stopped working
#1207 opened by jnchen - 1
Does openai_api.py support concurrent calls, and if not, how can concurrency be implemented?
#1216 opened by qinzhenyi1314 - 4
Why does inference hang after LoRA fine-tuning the 128k model?
#1181 opened by dazzlingCn - 1
Is the GLM large model a Causal Language Model?
#1205 opened by RyanOvO - 3
Page loading error after starting composite_demo with Streamlit
#1182 opened by wli173-ford - 1
quantization failed
#1197 opened by qslia - 0
Single-machine multi-GPU LoRA fine-tuning always hits NCCL errors
#1174 opened by Hxinyue - 5
Running Prediction
#1189 opened by sleep-zzw-bot - 0
When will chatglm4 be open-sourced?
#1193 opened by njhouse365 - 2
langchain demo issue: unable to call tools
#1185 opened by Lizhli2825 - 1
Could you provide the ChatGLM3 tokenizer's model.vocab?
#1184 opened by CNUIGB - 0
Multi-GPU run of OMP_NUM_THREADS=1 torchrun --standalone --nnodes=1 --nproc_per_node=8 finetune_hf.py data/AdvertiseGen/ THUDM/chatglm3-6b configs/lora.yaml configs/ds_zero_2.json
#1183 opened by cqray1990 - 1
Error when fine-tuning for tool calls
#1176 opened by koryako - 3
How to load the fine-tuned model in the web demo after LoRA fine-tuning
#1172 opened by dasaffa - 0
Out-of-GPU-memory issue when fine-tuning with the official ptuning_v2.yaml on a 4090
#1168 opened by Ayanami233e - 1
Fine-tuning Chatglm3-6b with the official ptuning_v2.yaml reports out of GPU memory on a 4090
#1166 opened by Ayanami233e