RElbers/info-nce-pytorch

Correspondence between query and negative_key

Hi, thanks for your implementation. I would like to confirm the following case with you:
Suppose each query has one positive key and two negative keys. Should I organise my input as follows?

query[0] -->positive_key[0] --> negative_key[0] & negative_key[1]
query[1] -->positive_key[1] --> negative_key[2] & negative_key[3]
....

Thanks very much.

Sorry for the late reply. Each query sample is paired with exactly one positive key. The negative keys, however, are not paired with any query at all. The set of negative keys is completely independent of the queries.

The query and positive_key arrays should have the same number of samples and the same dimensionality. The negative_keys array can have any number of samples, as long as its feature dimensionality matches.

For example:

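      import torch
      from info_nce import InfoNCE  # from the info-nce-pytorch package

      loss = InfoNCE()

      q = torch.randn(64, 48)   # 64 queries, 48 features each
      pk = torch.randn(64, 48)  # one positive key per query
      nk = torch.randn(5, 48)   # 5 negative keys, shared by all queries

      l = loss(query=q, positive_key=pk, negative_keys=nk)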
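
To make the shapes concrete, here is a minimal sketch in plain PyTorch of what a loss with unpaired negatives could compute. The function name `info_nce_unpaired` and the default temperature are assumptions for illustration, and this is not necessarily the repository's exact implementation. The point is that every query is compared against its own positive key and against the same shared set of negative keys.

```python
import torch
import torch.nn.functional as F


def info_nce_unpaired(query, positive_key, negative_keys, temperature=0.1):
    """Illustrative InfoNCE with a shared (unpaired) set of negatives.

    query:         (N, D)
    positive_key:  (N, D), row i is the positive for query i
    negative_keys: (M, D), shared by all N queries
    """
    # Normalize so that dot products are cosine similarities.
    query = F.normalize(query, dim=-1)
    positive_key = F.normalize(positive_key, dim=-1)
    negative_keys = F.normalize(negative_keys, dim=-1)

    # Similarity of each query with its own positive key: (N, 1)
    positive_logit = torch.sum(query * positive_key, dim=1, keepdim=True)

    # Similarity of each query with every negative key: (N, M)
    negative_logits = query @ negative_keys.T

    # Column 0 holds the positive, so the target class is always 0.
    logits = torch.cat([positive_logit, negative_logits], dim=1)  # (N, 1 + M)
    labels = torch.zeros(len(logits), dtype=torch.long, device=logits.device)

    return F.cross_entropy(logits / temperature, labels)


q = torch.randn(64, 48)
pk = torch.randn(64, 48)
nk = torch.randn(5, 48)
print(info_nce_unpaired(q, pk, nk))  # a scalar tensor
```

With this layout, negative_keys[0] is a negative for every query, not only for query[0], so the indexing scheme proposed in the question does not apply.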

For contrastive learning tasks you generally don't need the explicit negative keys.
I added it because I needed it when I was looking at the Spatially-Correlative Loss for image translation.
I am thinking of removing the negative_keys parameter and creating a separate repository for a loss that uses it, because most people interested in InfoNCE won't need it. Alternatively, I could change it so that the negative keys are also paired with a query.

Adding an option such that the user can choose how the negative key behaves is probably the best option.
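
As a rough sketch of how such an option might look, the function below switches between shared and per-query negatives. The names `info_nce_sketch` and `negative_mode`, and the `(N, M, D)` layout for paired negatives, are assumptions made for illustration rather than a description of the repository's API.

```python
import torch
import torch.nn.functional as F


def info_nce_sketch(query, positive_key, negative_keys,
                    temperature=0.1, negative_mode="unpaired"):
    """Illustrative InfoNCE with a switch for how negatives are used.

    negative_mode="unpaired": negative_keys is (M, D), shared by all queries.
    negative_mode="paired":   negative_keys is (N, M, D), M negatives per query.
    """
    query = F.normalize(query, dim=-1)
    positive_key = F.normalize(positive_key, dim=-1)
    negative_keys = F.normalize(negative_keys, dim=-1)

    # Each query against its own positive key: (N, 1)
    positive_logit = torch.sum(query * positive_key, dim=1, keepdim=True)

    if negative_mode == "unpaired":
        # Each query against every shared negative: (N, M)
        negative_logits = query @ negative_keys.T
    elif negative_mode == "paired":
        # Query i against only its own M negatives: (N, M)
        negative_logits = torch.einsum("nd,nmd->nm", query, negative_keys)
    else:
        raise ValueError(f"unknown negative_mode: {negative_mode!r}")

    # Column 0 holds the positive, so the target class is always 0.
    logits = torch.cat([positive_logit, negative_logits], dim=1)
    labels = torch.zeros(len(logits), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits / temperature, labels)


# The case from the question: two negatives per query.
q = torch.randn(64, 48)
pk = torch.randn(64, 48)
nk = torch.randn(64, 2, 48)  # nk[i] holds the two negatives for q[i]
print(info_nce_sketch(q, pk, nk, negative_mode="paired"))
```

In paired mode the negatives for query[i] sit at negative_keys[i], so the flat indexing from the question (negative_key[0] and negative_key[1] for query[0]) would correspond to reshaping a `(2 * N, D)` tensor to `(N, 2, D)`.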