facebookresearch/DPR

About dataset generation

Heisenberg-Yin opened this issue · 0 comments

I am new to the dense retrieval task, and I have a question about the dataset, in which each example consists of a question, positives, hard negatives, and negatives.

I am not sure how the positives, hard negatives, and negatives are obtained. My understanding is that each query already comes with its positives, the hard negatives are retrieved with BM25, and the negatives are selected randomly.
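To make my understanding concrete, here is a minimal sketch of the procedure I have in mind. It is not taken from the DPR code: the function names (`bm25_scores`, `build_example`), the toy corpus, the BM25 parameters `k1` and `b`, and the output keys are all my own assumptions for illustration.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep only word characters, so "France." matches "france".
    return re.findall(r"\w+", text.lower())


def bm25_scores(query, passages, k1=1.5, b=0.75):
    """Score every passage against the query with a plain BM25 formula."""
    docs = [tokenize(p) for p in passages]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: in how many passages each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def build_example(question, positives, passages, n_hard=1, n_random=1, seed=0):
    """Assumed procedure: positives are given, hard negatives are the
    top BM25 passages that are not positives, negatives are random."""
    scores = bm25_scores(question, passages)
    positive_set = set(positives)
    # Non-positive passages, best BM25 match first.
    candidates = sorted(
        (i for i, p in enumerate(passages) if p not in positive_set),
        key=lambda i: scores[i],
        reverse=True,
    )
    hard = [passages[i] for i in candidates[:n_hard]]
    pool = [passages[i] for i in candidates[n_hard:]]
    rng = random.Random(seed)
    negatives = rng.sample(pool, min(n_random, len(pool)))
    return {
        "question": question,
        "positives": list(positives),
        "hard_negatives": hard,
        "negatives": negatives,
    }


passages = [
    "Paris is the capital of France.",
    "The capital of Germany is Berlin.",
    "Bananas are rich in potassium.",
    "The Pacific is the largest ocean.",
]
example = build_example(
    "what is the capital of France",
    positives=[passages[0]],
    passages=passages,
)
print(example)
```

In this toy corpus the passage about Germany shares the most query terms with the question without being a positive, so it is the one I would expect to be picked as the hard negative.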

Is that right?

Best Wishes.