Awesome-RLHF

Awesome Reinforcement Learning from Human Feedback, the secret behind ChatGPT XD

MIT License
