Deep reinforcement learning from human preferences in PyTorch (WIP)
Primary language: Python. License: MIT.