Question about the loss function
forence opened this issue · 1 comment
forence commented
I have studied your paper and code. As I understand it, each matched caption-image pair is treated as a positive sample, and the other (mini-batch size - 1) caption-image pairs in the mini-batch are treated as negative samples. However, if a mini-batch happens to contain several captions that belong to the same image, your code treats those pairs as negatives, even though they should be positives. Does this affect the hard sample mining for the contrastive loss?
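To make the concern concrete, here is a minimal sketch in plain Python. It is not your actual code: the function name `max_violation_loss`, the `image_ids` list, the `mask_same_image` flag, and the margin of 0.2 are all assumptions made for illustration. It assumes a hinge loss that, for each pair, uses only the hardest negative in the batch, and it compares the result with and without excluding captions that share the anchor's image.

```python
def max_violation_loss(sim, image_ids, margin=0.2, mask_same_image=False):
    """Hinge loss over the hardest in-batch negative, in both directions.

    sim[i][j]  : similarity between image i and caption j (hypothetical input);
                 the diagonal sim[i][i] is the matched pair.
    image_ids  : image_ids[i] identifies the image behind pair i (assumed
                 to be available from the data loader).
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        # Candidate negatives: every other pair in the batch, optionally
        # skipping pairs whose caption describes the same image as pair i.
        negatives = [
            j for j in range(n)
            if j != i and not (mask_same_image and image_ids[j] == image_ids[i])
        ]
        if not negatives:
            continue
        hardest_caption = max(sim[i][j] for j in negatives)  # image i vs. other captions
        hardest_image = max(sim[j][i] for j in negatives)    # caption i vs. other images
        total += max(0.0, margin + hardest_caption - sim[i][i])
        total += max(0.0, margin + hardest_image - sim[i][i])
    return total


# Toy batch: pairs 0 and 1 are two different captions of the same image (id 7).
image_ids = [7, 7, 9]
sim = [
    [0.90, 0.85, 0.10],
    [0.88, 0.90, 0.20],
    [0.10, 0.15, 0.80],
]

unmasked = max_violation_loss(sim, image_ids, mask_same_image=False)
masked = max_violation_loss(sim, image_ids, mask_same_image=True)
print(unmasked, masked)
```

In this toy batch the unmasked loss is nonzero only because the hardest "negative" for pairs 0 and 1 is the other caption of the same image; with the mask, the loss drops to zero. This is the situation I am asking about.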
Looking forward to your reply!