Self-Alignment with Principle-Following Reward Models
Primary LanguagePythonGNU General Public License v3.0GPL-3.0