/instructGOOSE

Implementation of Reinforcement Learning from Human Feedback (RLHF)

Primary Language: Jupyter Notebook
License: MIT