Awesome Reinforcement Learning from Human Feedback

A collection of resources on Reinforcement Learning from Human Feedback (RLHF), mainly focused on pretrained models.

📜 Papers & Blog

Transformer Reinforcement Learning (TRL): Trains GPT-style transformer models with Proximal Policy Optimization (PPO)
Transformer Reinforcement Learning X (TRLX): Extends TRL with Implicit Language Q-Learning (ILQL)
RL4LMs (a modular RL library for fine-tuning language models to human preferences) [Site]: Thoroughly tested and benchmarked in over 2000 experiments on language generation tasks, with different types of metrics and several RL algorithms. Also supports Seq2Seq models (e.g., T5, BART).

If you have any questions, please feel free to contact me (📧: andy.yangzhen@gmail.com).