PromptAttack

An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024)

Primary language: Python
