Geeks

thomfoster/minRLHF

A (somewhat) minimal library for finetuning language models with PPO on human feedback.

Python

5 Issues
86 Stargazers
1 Watcher

