/rl-teacher-Pytorch

Deep reinforcement learning from human preferences in PyTorch (WIP)

Primary language: Python
License: MIT
