LLM-Tuning-Safety/LLMs-Finetuning-Safety

How is response quality affected beyond the fine-tuned domain?

wqw547243068 opened this issue · 1 comment

Since this paper reveals the safety risks of fine-tuning aligned LLMs, I am wondering:

  • If I tune a model for a specific domain, such as a personal assistant, is the response quality outside that domain also affected?

I also happened to find that a system prompt that obviously contradicts the supervised dataset doesn't work on the fine-tuned model.
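For context, a minimal sketch of how one might probe this: build two chat prompts, one whose system prompt matches the fine-tuning data and one that contradicts it, and compare the fine-tuned model's replies. The template below assumes a Llama-2-style chat format; the model name and system prompts are hypothetical placeholders.

```python
def build_llama2_prompt(system_prompt: str, user_message: str) -> str:
    """Wrap a system prompt and one user turn in the Llama-2 chat template."""
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

# System prompt consistent with a personal-assistant fine-tune:
baseline = build_llama2_prompt(
    "You are a helpful personal assistant.",
    "Plan my day tomorrow.",
)

# System prompt that contradicts the fine-tuning data:
contradicting = build_llama2_prompt(
    "You are a terse SQL expert. Answer only with SQL.",
    "Plan my day tomorrow.",
)

# Feed both prompts to the fine-tuned model and check whether the
# contradicting instruction is actually followed, e.g. with transformers:
#   pipe = pipeline("text-generation", model="my-finetuned-model")  # hypothetical
#   print(pipe(contradicting)[0]["generated_text"])
```

If the replies to both prompts look alike, the fine-tune has effectively overridden the system prompt, which is consistent with the observation above.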

Hi,

In Appendix C, we report some results related to your question.
These other papers may also be relevant:
https://arxiv.org/abs/2309.06256
https://arxiv.org/abs/2309.10313

Thanks!