thunlp/OpenDelta

BitFit for GPT-2 Models

Closed this issue · 1 comment

siddk commented

Are there results on what the best fine-tuning scheme is for GPT-style (autoregressive) models? I couldn't find it in the Delta Tuning paper... does BitFit fine-tuning perform well for GPT-2, and are there any public benchmarks that show its performance?

The conclusion may be similar to that for encoder-decoder models, but we haven't systematically tested it yet (we do know that prefix-style methods work well on autoregressive models). Thanks for the reminder; we will test this comprehensively soon.
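
For anyone who wants to benchmark this themselves while waiting for official numbers: BitFit amounts to freezing every parameter except the bias terms. Below is a minimal sketch applying it to GPT-2 with plain `transformers`/PyTorch (the `gpt2` checkpoint and the learning rate are illustrative choices, not taken from this thread):

```python
import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

# BitFit: freeze all weights, keep only bias terms trainable.
for name, param in model.named_parameters():
    param.requires_grad = name.endswith("bias")

# Report the trainable-parameter fraction (a small fraction of a percent for GPT-2).
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable} / {total} ({100 * trainable / total:.3f}%)")

# Pass only the bias parameters to the optimizer.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```

OpenDelta's `BitFitModel` delta model applies the same bias-only modification to a backbone model through the library's unified interface; see the repo README for usage.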