About the selection of Prompt length M
Thanks for your great work, but I have a question about the selection of the prompt length M. In the paper, under "Effect of Prompt Length M", M=8 is the best hyperparameter, but I noticed that M is set to 4 in the code. I would like to know the effect of M (which determines the output of the TKD part).
Firstly, most existing prompt tuning methods set the prompt length to 4. Therefore, setting M=4 makes for a fair comparison with existing methods.
Secondly, existing methods have shown that a longer prompt can obtain higher performance. In our TCP, although setting M=8 obtains the highest performance of 79.63%, the gap to the 79.51% obtained with M=4 or M=16 is small. Therefore, the proposed TCP is relatively insensitive to the prompt length, which shows that using the class-aware prompt to capture prior textual-level class knowledge matters more than the choice of prompt length (M = 1/2/4/8/16 gives H = 79.26/79.30/79.51/79.63/79.51; see the note on H below).
Finally, all results reported in the paper are obtained with the prompt length set to 4.
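
A note on the metric: assuming the standard base-to-new generalization protocol from CoOp/CoCoOp-style papers (an assumption, since the thread does not define it), H above is the harmonic mean of base-class and new-class accuracy:

$$H = \frac{2 \cdot \mathrm{Acc}_{\mathrm{base}} \cdot \mathrm{Acc}_{\mathrm{new}}}{\mathrm{Acc}_{\mathrm{base}} + \mathrm{Acc}_{\mathrm{new}}}$$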
So can I understand that you ran the prompt length ablation experiment with the meta-net output set to 4 the whole time?
For the prompt length ablation experiment, the output of the meta-net matches the prompt length rather than being set to 4 all the time. Note that the output size of the meta-net and the prompt length are always the same.
The standard formulation of Meta-Net is:
```python
self.meta_net = nn.Sequential(OrderedDict([
    ("linear1", nn.Linear(vis_dim, vis_dim // 4, bias=True)),
    ("relu", QuickGELU()),
    # The output width scales with the prompt length n_ctx (= M).
    ("linear2", nn.Linear(vis_dim // 4, n_ctx * ctx_dim, bias=True)),
]))
```
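
To make the coupling concrete, here is a minimal runnable sketch of how a meta-net of this shape relates to the prompt length. The dimensions, the `QuickGELU` definition, and the final reshape are assumptions based on CLIP/CoCoOp-style codebases, not necessarily the exact TCP code:

```python
import torch
import torch.nn as nn
from collections import OrderedDict

class QuickGELU(nn.Module):
    """GELU approximation used in the CLIP codebase."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(1.702 * x)

vis_dim, ctx_dim, n_ctx = 512, 512, 4  # n_ctx is the prompt length M

meta_net = nn.Sequential(OrderedDict([
    ("linear1", nn.Linear(vis_dim, vis_dim // 4, bias=True)),
    ("relu", QuickGELU()),
    ("linear2", nn.Linear(vis_dim // 4, n_ctx * ctx_dim, bias=True)),
]))

features = torch.randn(100, vis_dim)  # e.g., one feature vector per class

# The flat output has n_ctx * ctx_dim entries per input, so it reshapes
# into exactly M context vectors; changing M changes linear2's width too.
ctx = meta_net(features).reshape(-1, n_ctx, ctx_dim)
print(ctx.shape)  # torch.Size([100, 4, 512])
```

So in the M ablation, `linear2`'s output width changes with M automatically, which is what "the output size and the prompt length are always the same" means above.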
Thanks for your reply.