Parameter confusion

Question

micrazy opened this issue 2 years ago · 1 comments

Hi @jiong-zhang, could you please explain in more detail the difference between the 'only_topk' parameter in 'matcher_params_chain' and the one in 'ranker_params.hlm_args.model_chain'?

For example, in Eurlex-4K-roberta, why is "only_topk": 25 in the ranker but "only_topk": 5 in the matcher at the bottom level? Should these two parameters be the same? Or should only_topk in the matcher be larger than in the ranker?

Thanks.

Answer 1 · 2022-09-01T21:55:32.000Z

Hi @micrazy, the only_topk in matcher_params_chain controls the number of MAN (matcher-aware negatives) clusters used during transformer fine-tuning; increasing this value introduces more negative samples in the fine-tuning phase. Similarly, the only_topk in ranker_params.hlm_args.model_chain controls the number of MAN clusters at each level for the final linear ranker training.

In the Eurlex-4K case, for example, we include more negative samples during fine-tuning (the top-25 clusters predicted by the parent level) than during linear ranker training (the top-5 clusters from the parent level).
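To make the effect of only_topk concrete, here is a toy sketch in plain Python. It is not PECOS code: the function name, scores, and cluster ids are made up for illustration. It assumes that the matcher-aware negatives at a level are the top-k clusters predicted by the matcher that are not true positives for the instance, so a larger k can only add negatives.

```python
def man_negative_clusters(scores, positive_clusters, only_topk):
    """Return the top-`only_topk` predicted clusters that are not positives.

    Toy illustration only: `scores` maps cluster id -> matcher score.
    """
    # Rank cluster ids by descending score and keep the first `only_topk`.
    ranked = sorted(scores, key=scores.get, reverse=True)[:only_topk]
    # Predicted clusters that are not true positives act as negatives.
    return [c for c in ranked if c not in positive_clusters]


# Hypothetical matcher scores for 8 clusters; cluster 2 is the only positive.
scores = {0: 0.10, 1: 0.75, 2: 0.90, 3: 0.40, 4: 0.05, 5: 0.60, 6: 0.30, 7: 0.20}
positives = {2}

few = man_negative_clusters(scores, positives, only_topk=3)
many = man_negative_clusters(scores, positives, only_topk=6)
print(len(few), few)    # negatives with a small only_topk
print(len(many), many)  # negatives with a larger only_topk
```

The negatives selected with the smaller k are always a subset of those selected with the larger k, which is the sense in which raising only_topk "introduces more negative samples".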

There's no hard constraint on either of these parameters, but note that they are only used if you include matcher-aware negatives (man) in your negative sampling scheme.
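As a rough map of where the two values live, the sketch below builds a params dictionary from the key paths named in this thread and reads both only_topk values at the bottom level. Everything else is an assumption for illustration: the flat top-level layout, one entry per level in each chain, the negative_sampling key with schemes joined by "+", the helper names, and the numeric values (placeholders, not the Eurlex-4K-roberta settings). Check the actual params JSON shipped with your PECOS version.

```python
# Hypothetical params layout; only the two key paths come from this thread.
params = {
    "matcher_params_chain": [
        {"only_topk": 10},  # upper level (placeholder value)
        {"only_topk": 10},  # bottom level (placeholder value)
    ],
    "ranker_params": {
        "hlm_args": {
            "model_chain": [
                {"only_topk": 20},  # upper level (placeholder value)
                {"only_topk": 20},  # bottom level (placeholder value)
            ]
        }
    },
    # Assumed key and format: sampling schemes joined by "+".
    "negative_sampling": "tfn+man",
}


def bottom_level_topk(p):
    """Return (matcher only_topk, ranker only_topk) at the last level."""
    matcher_k = p["matcher_params_chain"][-1]["only_topk"]
    ranker_k = p["ranker_params"]["hlm_args"]["model_chain"][-1]["only_topk"]
    return matcher_k, ranker_k


def uses_man(p):
    """True if 'man' is one of the configured negative sampling schemes."""
    return "man" in p.get("negative_sampling", "").split("+")


matcher_k, ranker_k = bottom_level_topk(params)
if uses_man(params):
    print(f"matcher only_topk={matcher_k}, ranker only_topk={ranker_k}")
else:
    print("'man' is not in the sampling scheme, so only_topk has no effect here")
```

Since the two values are read from separate places, nothing forces them to be equal or ordered, which matches the "no hard constraint" point above.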