thu-coai/JailbreakDefense_GoalPriority
[ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
Python