[Feature Request] multi-turn reward for RLHF
vmoens opened this issue · 1 comment
vmoens commented
Implement rewards as proposed in https://arxiv.org/pdf/2405.14655
ggbondcxl commented
I am very interested in multi-turn RLHF. Could you share some sample code?