openai/summarize-from-feedback

Code for "Learning to summarize from human feedback"

Python · NOASSERTION

Issues

Is it possible to run the model on Google Colab?
#7 opened 4 years ago by toniju98
7
Question on the SFT/reward model/PPO dataset numbers
#25 opened 9 months ago by hanyin88
0
Look for open-source reward models or datasets
#24 opened 10 months ago by NEUBuffett
0
Modify the PPO implementation
#22 opened a year ago by AlisonWen
0
Unable to install the environment
#20 opened 2 years ago by liuyeah
2
'POST' data size, is there a limit?
#11 opened 3 years ago by emanokaro
2
Split descriptions for Axis data set
#19 opened 2 years ago by UntotaufUrlaub
0
Which part of the dataset should be used to fine-tune the model?
#18 opened 2 years ago by xuyifan-0731
2
`openaipublic.blob.core.windows.net` has a broken CDN
#17 opened 2 years ago by JoeyBlogs
1
Why the reward model loss?
#9 opened 3 years ago by ghosthamlet
3
Dataset Links don't work
#16 opened 3 years ago by dhlee347
1
[Q] How to use the model for inference?
#13 opened 3 years ago by NightMachinery
0
ERROR: Couldn't install package: mpi4py
#12 opened 3 years ago by yxli2123
1
How to assign rewards to each time step in a generation trajectory?
#6 opened 4 years ago by ArvinZhuang
7
Is it possible to run the model on customized <doc, summary> samples?
#10 opened 4 years ago by hanzou007
0
License for dataset?
#8 opened 4 years ago by AdamGleave
2
Can you please open-source the reward model training dataset?
#4 opened 4 years ago by shamanez
1
cuda runtime error (101) : invalid device ordinal
#5 opened 4 years ago by ioannist
2
human feedback in validation dataset?
#3 opened 4 years ago by ShiYaya
6
Trying to run the model on an AWS instance but getting many errors...
#2 opened 4 years ago by aced125
6
Looking for RL algorithm implementation
#1 opened 4 years ago by zikunukiz
1