Issues
RLHF training issue based on ChatGLM2
#23 opened by UltraZeroyH (3 comments)
Pangu 2.6b fails to start.
#25 opened by Liufeiran123 (2 comments)
Where is the pretrain_data_v1.jsonl file?
#24 opened by Liufeiran123 (2 comments)
Will integration of RLHF alternatives be considered in the future?
#21 opened by skykiseki (0 comments)
Can the DeepSpeed and trlx RLHF pipelines support SFT ChatGLM-6B?
#13 opened by GUORUIWANG (5 comments)
Reward model inference issue
#16 opened by ItGirls (1 comment)
ChatGLM + RLHF
#18 opened by MAJIN123 (1 comment)
Is the LoRA approach supported?
#19 opened by 70557dzqc (3 comments)
Code issue in train_rlhf-trlx.py
#15 opened by taofennanhai (2 comments)
On taking the last token's score as the reward
#17 opened by Bo396543018 (7 comments)
Loss does not converge when training the reward model with ChatGLM-6B
#10 opened by GUORUIWANG (1 comment)
Is there a comparison of model performance with and without RLHF?
#4 opened by macheng6 (1 comment)
Reward model implementation issue
#12 opened by DamonYangyang (4 comments)
How is the GLM-10B-chinese model saved when using LoRA?
#9 opened by taofennanhai (4 comments)
Is there a model-parallel approach for RLHF training with GLM-10B-chinese?
#11 opened by taofennanhai (4 comments)
RLHF-related questions
#5 opened by taofennanhai (0 comments)
DeepSpeed training speed
#6 opened by superqing001 (2 comments)
Why is the <sep> token added during training?
#2 opened by Nipi64310 (2 comments)
Model performance after adding the reward model (RW)
#1 opened by yxk9810