Parameter confusion
micrazy opened this issue · 1 comments
Hi @jiong-zhang, could you please explain in more detail the difference between the 'only_topk' parameter in 'matcher_params_chain' and the one in 'ranker_params.hlm_args.model_chain'?
For example, in Eurlex-4K-roberta, why is "only_topk": 25 in the ranker but "only_topk": 5 in the matcher at the bottom level? Should these two parameters be the same? Or should the top-k in the matcher be larger than in the ranker?
Thanks.
Hi @micrazy, the `only_topk` in `matcher_params_chain` controls the number of MAN clusters during transformer fine-tuning; increasing this value introduces more negative samples in the fine-tuning phase. Similarly, the `only_topk` in `ranker_params.hlm_args.model_chain` controls the number of MAN clusters at each level for the final linear ranker training.
In the Eurlex-4K case, for example, we include more negative samples during fine-tuning (top-25 clusters predicted by the parent level) than in linear ranker training (top-5 clusters from the parent level).
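To make the two locations concrete, here is a minimal sketch of where each `only_topk` lives, written as a plain Python dict. Only the key names `matcher_params_chain`, `ranker_params`, `hlm_args`, `model_chain`, and `only_topk` come from this thread; the enclosing layout, the number of levels, the values, and the helper `collect_only_topk` are assumptions for illustration, not the actual Eurlex-4K configuration.

```python
# Hypothetical parameter layout. All values are placeholders,
# not the real Eurlex-4K settings.
params = {
    "matcher_params_chain": [
        {"only_topk": 10},  # upper level (placeholder value)
        {"only_topk": 20},  # bottom level (placeholder value)
    ],
    "ranker_params": {
        "hlm_args": {
            "model_chain": [
                {"only_topk": 10},  # upper level (placeholder value)
                {"only_topk": 5},   # bottom level (placeholder value)
            ]
        }
    },
}


def collect_only_topk(params):
    """Return (matcher_topk, ranker_topk), each a list with one entry per level."""
    matcher = [level["only_topk"] for level in params["matcher_params_chain"]]
    ranker = [
        level["only_topk"]
        for level in params["ranker_params"]["hlm_args"]["model_chain"]
    ]
    return matcher, ranker


matcher_topk, ranker_topk = collect_only_topk(params)
for depth, (m, r) in enumerate(zip(matcher_topk, ranker_topk)):
    print(f"level {depth}: fine-tuning only_topk={m}, linear ranker only_topk={r}")
```

Printing the two values side by side per level makes it easy to check that each was set deliberately, since nothing forces them to match.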
There's no hard constraint on either of these parameters, but note that they are only used if you include matcher-aware negatives (`man`) in your negative sampling scheme.
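The effect of a larger `only_topk` on the number of negatives can be illustrated with a small sketch. This is not PECOS code: the function `man_candidates`, the scores, and the cluster indices are hypothetical, and the sketch only shows the general idea that keeping more of the top-scoring clusters from the parent level yields more negatives.

```python
def man_candidates(parent_scores, positive_clusters, only_topk):
    """Keep the only_topk highest-scoring clusters; non-positive ones act as negatives.

    parent_scores: hypothetical per-cluster scores from the parent level.
    positive_clusters: set of cluster indices that contain a true label.
    """
    ranked = sorted(
        range(len(parent_scores)),
        key=lambda c: parent_scores[c],
        reverse=True,
    )
    top = ranked[:only_topk]
    negatives = [c for c in top if c not in positive_clusters]
    return top, negatives


# Made-up scores for six clusters; cluster 0 is the only positive.
scores = [0.9, 0.1, 0.7, 0.4, 0.8, 0.2]
positives = {0}

_, neg_small = man_candidates(scores, positives, only_topk=2)
_, neg_large = man_candidates(scores, positives, only_topk=4)
print(len(neg_small), len(neg_large))
```

With these made-up scores, widening the cutoff from 2 to 4 increases the number of negative clusters, which is the trade-off the two `only_topk` settings control at their respective stages.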