/attention-based-credit

Code for the paper "Dense Reward for Free in Reinforcement Learning from Human Feedback" (ICML 2024) by Alex J. Chan, Hao Sun, Samuel Holt, and Mihaela van der Schaar.

Primary language: Python. License: MIT.
