ncbi/MedCPT

BioAsq negative candidates overlap with positive

amanpreet692 opened this issue · 1 comment

Hello,
Thank you for sharing the code and methods!
I was trying to use the sample BioAsq IR data provided for the reranker here.
But it looks like many of the positive and negative article IDs overlap, which led to unreliable results.


Thank you for your feedback. We have updated the training data samples so that the negative PMIDs don't contain any positive ones.
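For anyone who wants to verify this on their own copy of the data, here is a minimal sketch of an overlap check and filter. It assumes each query's positives and negatives are available as plain lists of PMID strings; the function names and that layout are hypothetical and not taken from the repository.

```python
def find_overlap(positives, negatives):
    """Return the sorted PMIDs that appear in both lists."""
    return sorted(set(positives) & set(negatives))


def remove_overlap(positives, negatives):
    """Return the negatives with any positive PMIDs dropped, preserving order."""
    positive_set = set(positives)
    return [pmid for pmid in negatives if pmid not in positive_set]


# Made-up PMIDs, for illustration only.
positives = ["101", "102"]
negatives = ["102", "103", "101", "104"]

overlap = find_overlap(positives, negatives)
cleaned = remove_overlap(positives, negatives)
print(overlap)   # PMIDs labeled both positive and negative
print(cleaned)   # negatives that are safe to use
```

Running `find_overlap` per query before training is a cheap way to confirm that no article is labeled both relevant and non-relevant.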

Please note that these are just sampled pairs for demonstration purposes. In practice, we recommend that you use local negatives that are sampled from the retriever distribution to train the reranker.
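As a rough illustration of that recommendation, the sketch below samples negatives from a retriever's ranked output for one query. It reads "local negatives" as non-positive articles among the retriever's top results, which is an assumption; `sample_local_negatives`, `top_k`, and the input layout are hypothetical and not part of the MedCPT code.

```python
import random


def sample_local_negatives(ranked_pmids, positives, num_negatives,
                           top_k=100, seed=None):
    """Sample negatives from the top_k retrieved PMIDs, excluding positives.

    ranked_pmids: PMIDs returned by the retriever, best first (assumed input).
    positives: PMIDs known to be relevant for the query.
    """
    positive_set = set(positives)
    # Keep only top-ranked candidates that are not labeled positive.
    candidates = [p for p in ranked_pmids[:top_k] if p not in positive_set]
    if len(candidates) <= num_negatives:
        return candidates
    rng = random.Random(seed)
    return rng.sample(candidates, num_negatives)


# Made-up PMIDs, for illustration only.
ranked = ["1", "2", "3", "4", "5", "6"]
negatives = sample_local_negatives(ranked, positives=["2", "5"],
                                   num_negatives=2, top_k=4, seed=0)
print(negatives)
```

Filtering out positives before sampling also avoids the overlap problem reported in this issue.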

Let us know if you have any other questions.

Best,
Qiao