/llm_debate_backdoor

Attempt to use this for "AI Control: Improving Safety Despite Intentional Subversion"

Primary language: Python. License: MIT.
