/minRLHF

A (somewhat) minimal library for finetuning language models with PPO on human feedback.

Primary Language: Python
