GanjinZero/RRHF

[NIPS2023] RRHF & Wombat

Python

Issues

RRHF with Online Sampling
#57 opened 4 months ago by sqqiao
1
resize embedding after add_special_tokens
#56 opened 4 months ago by Switchsyj
0
Runtime error: data type error
#55 opened 6 months ago by sqqiao
0
About the variance of PPL
#54 opened a year ago by skepsun
1
Could the averaging step in the loss computation be optimized?
#51 opened a year ago by shyoulala
6
Bug when computing the SFT loss
#48 opened a year ago by shyoulala
2
If I want to switch the model to baichuan2-7b-chat, what changes are needed?
#52 opened a year ago by IT-five
3
The loss code has a bug in how it handles batch size.
#23 opened 2 years ago by echoht
4
How is the comparison on the Vicuna test set carried out?
#53 opened a year ago by IT-five
0
Question about data construction
#50 opened a year ago by lylcst
1
Label Shifts
#49 opened a year ago by yafuly
0
The calculation about rrhf loss in the code seems to be completely wrong
#29 opened 2 years ago by yyhycx
1
About alpaca-7B and LLaMA-7B
#47 opened a year ago by NEUBuffett
3
Question about dummy_target
#45 opened a year ago by xunfengzhangyang
5
Question about the IMDB dataset
#46 opened a year ago by stevie1023
2
Question about dummy_target
#44 opened a year ago by xunfengzhangyang
1
Problem loading the model
#43 opened a year ago by LiangZhuuu
11
Loss function
#42 opened a year ago by xiayouhong
3
OOM problem during training
#41 opened a year ago by Guochry
1
can RRHF train on v100 32G?
#20 opened 2 years ago by akk-123
24
Wombat and RRHF
#40 opened a year ago by Guochry
4
The generation config for evaluation
#39 opened a year ago by stevie1023
6
Training on a single A100 raises torch.distributed.elastic.multiprocessing.api.SignalException: Process 2920830 got signal: 1
#35 opened a year ago by Zhang-Each
2
What is the purpose of labels != -100?
#38 opened a year ago by LSX-Sneakerprogrammer
3
The size of tensor a (8) must match the size of tensor b (2) at non-singleton dimension 1
#36 opened a year ago by ZJXNEFU
11
RRHFTrainer.gather_logits_labels label in-place operation error
#37 opened a year ago by asadfgglie
8
NameError: name 'save_fsdp_model' is not defined
#33 opened a year ago by ZJXNEFU
4
The evaluation method is strongly affected by position
#32 opened 2 years ago by xiaoyuan1996
2
the evaluation script with average reward score (Dahoas/gptj-rm-static)
#34 opened a year ago by stevie1023
5
Question about handling samples whose answers have duplicate scores
#25 opened 2 years ago by yanhan19940405
7
Abnormal output from wombat-7B
#21 opened 2 years ago by lx86110
15
CUDA out of memory when trainer.model.state_dict()
#30 opened 2 years ago by Akiraxty
2
Looking forward to LoRA or P-tuning support
#31 opened 2 years ago by Noyce765103
1
How to use it? Are there any code examples?
#28 opened 2 years ago by Mr-IT007
1
PPL
#26 opened 2 years ago by SuMeng123
7
Some training details
#27 opened 2 years ago by xiaoyuan1996
2
training with my own gpt2
#22 opened 2 years ago by dyyzhmm
1
PPO implementation
#19 opened 2 years ago by yuzc19
2
Wombat-7B, Wombat-7B-gpt4 and ChatGPT Results on Comparison based on Vicuna test set, evaluation by gpt-4.
#18 opened 2 years ago by onlyfish79
4
About model training details
#17 opened 2 years ago by yanhan19940405
12
Results on Comparison based on Vicuna test set
#16 opened 2 years ago by LeeShiyang
1
Why use HingeLoss instead of BPRLoss ?
#15 opened 2 years ago by KID-22
1
single_sentence_inference output is empty
#14 opened 2 years ago by better629
10
We are trying to evaluate Wombat on Vicuna test set, but we do not have GPT4 API.
#11 opened 2 years ago by GanjinZero
0
This loss seems to consume a lot of memory.
#13 opened 2 years ago by piekey1994
4
Error when trying to run inference
#12 opened 2 years ago by oasis-0927
5
RRHF only works on llama model.
#8 opened 2 years ago by Taekyoon
16
Wombat weights release?
#3 opened 2 years ago by generalsvr
4
[MiniDiscussion] RRHF is similar to imitation learning
#4 opened 2 years ago by mickelliu
5