facebookresearch/multihop_dense_retrieval

Why mask the 1st hop?

yangky11 opened this issue · 2 comments

Hi,

Thanks for releasing the code! I have a minor question about this piece of code:

# mask the 1st hop
bsize = outputs["q"].size(0)
scores_1_mask = torch.cat([torch.zeros(bsize, bsize), torch.eye(bsize)], dim=1).to(outputs["q"].device)
scores_1_hop = scores_1_hop.float().masked_fill(scores_1_mask.bool(), float('-inf')).type_as(scores_1_hop)

What is the purpose of masking the 1st hop? Does it improve the final experimental results? Thanks!

xwhan commented

Hi @yangky11, the mask is there to avoid labeling the 2nd-hop supporting passage as a negative for the 1st hop. The hop order is not always obvious, and this is especially true for comparison questions. It gave some improvement in some initial experiments.
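
For readers who want to see the effect, here is a minimal sketch of what the mask does. It assumes the columns of `scores_1_hop` are the batch's first-hop passages followed by its second-hop passages, and that the first-hop target for question `i` is column `i`. Neither is shown in the snippet above, and `q`, `c1`, `c2`, and `all_ctx` are hypothetical stand-ins filled with random values rather than the repository's actual encoder outputs.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
bsize, hidden = 3, 4

# Hypothetical stand-ins for encoder outputs (names and shapes are assumptions).
q = torch.randn(bsize, hidden)   # question embeddings
c1 = torch.randn(bsize, hidden)  # assumed: first-hop passage for each question
c2 = torch.randn(bsize, hidden)  # assumed: second-hop passage for each question

# Assumed layout: columns [0, bsize) are first-hop passages,
# columns [bsize, 2 * bsize) are second-hop passages.
all_ctx = torch.cat([c1, c2], dim=0)   # (2 * bsize, hidden)
scores_1_hop = q @ all_ctx.t()         # (bsize, 2 * bsize)

# Same mask construction as in the quoted snippet: the right half is an
# identity matrix, so row i masks column bsize + i (its own second-hop passage).
scores_1_mask = torch.cat([torch.zeros(bsize, bsize), torch.eye(bsize)], dim=1)
masked = scores_1_hop.float().masked_fill(
    scores_1_mask.bool(), float("-inf")
).type_as(scores_1_hop)

# Assumed first-hop target: question i should retrieve column i.
target_1_hop = torch.arange(bsize)
probs = F.softmax(masked, dim=1)
loss = F.cross_entropy(masked, target_1_hop)

print(masked)
print(probs)
print(loss)
```

A masked entry has a score of `-inf`, so it gets zero probability under the softmax and drops out of the denominator. Under the assumptions above, each question's own second-hop passage is no longer an in-batch negative for the first hop, while the second-hop passages of the other questions in the batch still are.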

That makes sense. Thank you!