UCSB-NLP-Chang/SemanticSmooth

Implementation of the paper 'Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing'

Python · MIT
