What about response quality beyond the fine-tuning domain?
wqw547243068 opened this issue · 1 comment
wqw547243068 commented
Since this paper reveals the safety risks of fine-tuning aligned LLMs, I am wondering:
- If I fine-tune a model for a specific domain, such as a personal assistant, is the response quality outside that domain also affected?

I also happened to find that a system prompt that clearly contradicts the supervised fine-tuning data has no effect on the fine-tuned model.
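For reference, here is a minimal sketch of how I probed this, assuming a Hugging Face chat model with a chat template; the model id, system prompt, and user query below are placeholders, not from the paper:

```python
# Sketch: check whether a fine-tuned model still follows a system prompt
# that contradicts its fine-tuning data, on an out-of-domain query.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/your-finetuned-model"  # hypothetical fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    # System prompt deliberately contradicting the "personal assistant" fine-tuning data.
    {"role": "system", "content": "You are a terse legal assistant. Answer in one sentence."},
    # Out-of-domain user query to test response quality beyond the fine-tune domain.
    {"role": "user", "content": "Summarize the key idea of gradient descent."},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Comparing this output against the base (non-fine-tuned) model on the same messages is what led me to suspect the system prompt is being ignored.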
Unispac commented
Hi,
In Appendix C, we have some results related to your question.
Also, here are some other relevant papers that may answer your question:
https://arxiv.org/abs/2309.06256
https://arxiv.org/abs/2309.10313
Thanks!