/ReNeLLM

The official implementation of our NAACL 2024 paper "A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily".

Primary language: Python. License: MIT.
