Princeton-SysML/Jailbreak_LLM

Missing Chat Template

justinphan3110cais opened this issue · 2 comments

Hi Authors, we noticed that all of the attack code is missing the chat templates for the models, e.g. USER: {instruction} ASSISTANT: for Vicuna or [INST] {instruction} [/INST] for Llama-2, which makes the benchmark not comparable to other attacks. Is this indeed the case, or am I missing something?
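For concreteness, here is a minimal sketch of the two templates being described. These helper functions are my own illustration based on the Vicuna and Llama-2-chat model cards, not code from this repo, and they omit system prompts and multi-turn handling:

```python
def vicuna_prompt(instruction: str) -> str:
    # Vicuna v1.1-style template: single USER turn, then the ASSISTANT
    # marker the model was fine-tuned to continue from.
    return f"USER: {instruction} ASSISTANT:"

def llama2_prompt(instruction: str, system: str = "") -> str:
    # Llama-2-chat template: instruction wrapped in [INST] ... [/INST],
    # with an optional <<SYS>> block for a system prompt.
    sys_block = f"<<SYS>>\n{system}\n<</SYS>>\n\n" if system else ""
    return f"[INST] {sys_block}{instruction} [/INST]"

print(vicuna_prompt("Tell me a joke."))
print(llama2_prompt("Tell me a joke."))
```

Passing the raw instruction to the model, without this wrapping, puts it out of the distribution the chat models were aligned on.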

I've noticed this too: the code in attack.py does not use the chat template despite loading the chat versions of the models. This is not ideal, since these models were fine-tuned and aligned only on prompts that follow the chat template separating the user and assistant turns. I agree with @justinphan3110cais that this makes the results hard to compare to other methods that operate under the chat template. For one reference, GCG's example code here uses the chat template.

This issue was also raised in some follow-up work that cited this paper/repo for not using the chat template despite using the chat versions of the models; I've linked them to this issue. It isn't a serious problem as long as all compared methods consistently use (or don't use) the chat template, since relative performance remains comparable, but I do think the chat template should be the default setting going forward.