About the selection of Prompt length M
Thanks for your great work, but I have a question about the selection of the prompt length M. In the paper, under "Effect of Prompt Length M", M=8 is the best hyperparameter, but I noticed that M is set to 4 in the code. I would like to know the effect of M (which determines the output of the TKD part).
Firstly, most existing prompt tuning methods set the prompt length to 4. Therefore, setting M=4 makes for a fair comparison with existing methods.
Secondly, existing methods have shown that a longer prompt can obtain higher performance. In our TCP, although setting M=8 obtains the highest performance of 79.63%, the gap to the 79.51% obtained with M=4 or M=16 is small. Therefore, the proposed TCP is relatively insensitive to the prompt length, which shows that using the class-aware prompt to capture prior textual-level class knowledge matters more than the choice of prompt length (M = 1/2/4/8/16 gives H = 79.26/79.30/79.51/79.63/79.51; see the note on H below).
Finally, all results reported in the paper are obtained with the prompt length set to 4.
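
A note on the metric: assuming the standard base-to-new generalization protocol from CoOp/CoCoOp-style papers (an assumption, since the thread does not define it), H above is the harmonic mean of base-class and new-class accuracy:

$$H = \frac{2 \cdot \mathrm{Acc}_{\mathrm{base}} \cdot \mathrm{Acc}_{\mathrm{new}}}{\mathrm{Acc}_{\mathrm{base}} + \mathrm{Acc}_{\mathrm{new}}}$$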
So can I understand that you ran the prompt length ablation experiment with the meta-net output set to 4 the whole time?
For the prompt length ablation experiment, the output of the meta-net matches the prompt length rather than being set to 4 all the time. Note that the output size of the meta-net and the prompt length are always the same.
The standard formulation of Meta-Net is:
```python
self.meta_net = nn.Sequential(OrderedDict([
    ("linear1", nn.Linear(vis_dim, vis_dim // 4, bias=True)),
    ("relu", QuickGELU()),
    # The output width scales with the prompt length n_ctx (= M).
    ("linear2", nn.Linear(vis_dim // 4, n_ctx * ctx_dim, bias=True)),
]))
```
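
To make the coupling concrete, here is a minimal runnable sketch of how a meta-net of this shape relates to the prompt length. The dimensions, the `QuickGELU` definition, and the final reshape are assumptions based on CLIP/CoCoOp-style codebases, not necessarily the exact TCP code:

```python
import torch
import torch.nn as nn
from collections import OrderedDict

class QuickGELU(nn.Module):
    """GELU approximation used in the CLIP codebase."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(1.702 * x)

vis_dim, ctx_dim, n_ctx = 512, 512, 4  # n_ctx is the prompt length M

meta_net = nn.Sequential(OrderedDict([
    ("linear1", nn.Linear(vis_dim, vis_dim // 4, bias=True)),
    ("relu", QuickGELU()),
    ("linear2", nn.Linear(vis_dim // 4, n_ctx * ctx_dim, bias=True)),
]))

features = torch.randn(100, vis_dim)  # e.g., one feature vector per class

# The flat output has n_ctx * ctx_dim entries per input, so it reshapes
# into exactly M context vectors; changing M changes linear2's width too.
ctx = meta_net(features).reshape(-1, n_ctx, ctx_dim)
print(ctx.shape)  # torch.Size([100, 4, 512])
```

So in the M ablation, `linear2`'s output width changes with M automatically, which is what "the output size and the prompt length are always the same" means above.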
Thanks for your reply.