xrsrke/instructGOOSE
Implementation of Reinforcement Learning from Human Feedback (RLHF)
Jupyter Notebook · MIT
Issues
Add support for custom reward function
#5 opened by xrsrke - 4
Not working with cuda device.
#3 opened by hemangjoshi37a - 10
This repo seems interesting.
#1 opened by hemangjoshi37a - 6