/safety-rbr-code-and-data-pq

Code and example data for the paper: Rule Based Rewards for Language Model Safety

Primary language: Jupyter Notebook. License: MIT.
