How is the ANCE FirstP training data generated?
xyz8 opened this issue · 1 comments
xyz8 commented
How is the ANCE FirstP training data (bids_marco-doc_ance-maxp-10.tsv) generated?
zkt12 commented
Hi,
We followed the instructions in https://github.com/thunlp/OpenMatch/blob/master/retrievers/openmatch_ance_retriver_readme.md: we loaded the TREC DL document FirstP/MaxP checkpoint, encoded the queries and documents with it, and retrieved the top-k documents most similar to each query.
For each manually labeled query-document pair, we randomly sampled 10 documents from that query's top-k results as negatives.
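The sampling step could be sketched roughly as below. This is only an illustration, not the script that produced the released file: the function name `sample_negatives`, the in-memory layout of `labels` and `topk`, the fixed seed, and the choice to drop the labeled document from the candidate pool are all assumptions.

```python
import random


def sample_negatives(labels, topk, num_neg=10, seed=13):
    """Sample negatives for each labeled (query, document) pair.

    labels: list of (query_id, positive_doc_id) pairs (hypothetical layout).
    topk:   dict mapping query_id -> list of retrieved doc ids (hypothetical layout).
    Returns a list of (query_id, positive_doc_id, [negative_doc_ids]).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same sample
    samples = []
    for qid, pos_doc in labels:
        # Assumption: the labeled positive is removed from the candidate pool.
        candidates = [d for d in topk.get(qid, []) if d != pos_doc]
        # Take fewer negatives if the pool is smaller than num_neg.
        k = min(num_neg, len(candidates))
        samples.append((qid, pos_doc, rng.sample(candidates, k)))
    return samples
```

The top-k lists themselves would come from the ANCE retrieval run described above; how they are loaded from disk is left out here.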
Kaitao