THUDM/P-tuning-v2

Is it possible to use P-tuning v2 during inference without affecting the backbone model's behavior?

JuhaoLiang1997 opened this issue · 0 comments

Hi,

I've observed that running inference with P-tuning v2 using all-zero prefix parameters still changes the behavior of the original model. I'm wondering whether it is feasible to incorporate a prefix prompt that has no impact at all on the original model's behavior, or whether there is a problem with my experiment. This is how I zero out the prefix key/value tensors before the forward pass:

# Replace every learned prefix key/value tensor with zeros of the same shape and dtype
past_key_values = tuple(torch.zeros_like(pkv) for pkv in past_key_values)
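To check my understanding of why the outputs still differ, I wrote a minimal sketch of standard scaled dot-product attention (single head, batch dimension omitted; all names and shapes are made up for illustration). Even an all-zero key contributes exp(0) = 1 to the softmax denominator, so the prefix positions still absorb attention mass and dilute the weights on the real tokens, even though the zero values add nothing to the weighted sum:

import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, prefix_len, head_dim = 4, 2, 8

q = torch.randn(seq_len, head_dim)   # queries for the real tokens
k = torch.randn(seq_len, head_dim)   # keys for the real tokens
v = torch.randn(seq_len, head_dim)   # values for the real tokens

# Baseline: attention over the real tokens only (the original backbone).
base = F.softmax(q @ k.T / head_dim ** 0.5, dim=-1) @ v

# Prepend an all-zero prefix, as in my experiment above.
k_pfx = torch.cat([torch.zeros(prefix_len, head_dim), k])
v_pfx = torch.cat([torch.zeros(prefix_len, head_dim), v])
logits = q @ k_pfx.T / head_dim ** 0.5

# Each zero prefix key yields a logit of 0, i.e. exp(0) = 1 in the softmax
# denominator, so the prefix slots soak up attention mass and dilute the
# weights on the real tokens.
with_zero_prefix = F.softmax(logits, dim=-1) @ v_pfx
print(torch.allclose(base, with_zero_prefix))  # False: outputs differ

# Masking the prefix logits out entirely recovers the baseline exactly.
logits[:, :prefix_len] = float("-inf")
masked = F.softmax(logits, dim=-1) @ v_pfx
print(torch.allclose(base, masked))  # True

If that reasoning is correct, zeroing past_key_values can never reproduce the backbone exactly; the prefix positions would instead have to be excluded from the attention mask (e.g., passing 0s rather than 1s for the prefix slots, assuming a HuggingFace-style 0/1 attention_mask). Does that match your understanding? Your input on this matter would be greatly appreciated. Thank you.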