/rlhf-book

Textbook on reinforcement learning from human feedback

Primary Language: TeX. License: MIT.
