/configurable-safety-tuning

Data and models for the paper "Configurable Safety Tuning of Language Models with Synthetic Preference Data"

Primary language: Python
License: MIT
