Issues
Look for open-source reward models or datasets
#24 opened by NEUBuffett - 0
Modify the PPO implementation
#22 opened by AlisonWen - 2
Unable to install the environment
#20 opened by liuyeah - 2
'POST' data size, is there a limit?
#11 opened by emanokaro - 0
Split descriptions for Axis data set
#19 opened by UntotaufUrlaub - 2
Use which part of dataset to finetune model?
#18 opened by xuyifan-0731 - 1
Why the reward model loss?
#9 opened by ghosthamlet - 1
Dataset Links don't work
#16 opened by dhlee347 - 0
[Q] How to use the model for inference?
#13 opened by NightMachinery - 1
ERROR: Couldn't install package: mpi4py
#12 opened by yxli2123 - 7
License for dataset?
#8 opened by AdamGleave - 1
human feedback in validation dataset?
#3 opened by ShiYaya - 6
Looking for RL algorithm implementation
#1 opened by zikunukiz