I have a question about part of the code
mingchenLuo commented
if negative_keys is not None:
# Explicit negative keys
# Cosine between positive pairs
positive_logit = torch.sum(query * positive_key, dim=1, keepdim=True)
# Cosine between all query-negative combinations
negative_logits = query @ transpose(negative_keys)
# First index in last dimension are the positive samples
logits = torch.cat([positive_logit, negative_logits], dim=1)
labels = torch.zeros(len(logits), dtype=torch.long, device=query.device)
1) Why are the labels all zero? Shouldn't the positive sample pairs be labeled 1?
2) Is this cosine similarity? Isn't it just an inner product?
RElbers commented
Hi.
- The positive samples are at index 0 of the logits, so labels is just a list of all 0s.
- The vectors are first normalized and then the dot product is taken, which gives the cosine of the angle between them.
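For reference, a minimal sketch of the normalize-then-dot-product step described above (shapes and variable names are illustrative, not taken from the library):

import torch
import torch.nn.functional as F

# Illustrative shapes: 4 query/positive-key pairs with 128-dim embeddings.
query = torch.randn(4, 128)
positive_key = torch.randn(4, 128)

# L2-normalize each vector so that the dot product equals cosine similarity.
query = F.normalize(query, dim=-1)
positive_key = F.normalize(positive_key, dim=-1)

# Row-wise dot product between each query and its positive key.
# Each entry lies in [-1, 1]: the cosine of the angle between the pair.
positive_logit = torch.sum(query * positive_key, dim=1, keepdim=True)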
mingchenLuo commented
Hi.
Thanks for your answer.
That clears it up. Now I wonder whether it is feasible to replace cosine similarity with other distance or similarity measures.
Looking forward to your reply.
RElbers commented
In theory that should be possible. You just need a measure which gives low values for positive pairs and high values for negative pairs.
mingchenLuo commented
Hi.
The InfoNCE loss is logSoftmax + NLLLoss, i.e. nn.CrossEntropyLoss(), and it is a positive value. To minimize the loss, shouldn't we maximize the softmax output for the correct class? That is, shouldn't we make the softmax numerator, namely the positive sample pair, larger and the negative sample pairs smaller? Isn't that contrary to what you said in your previous comment? I don't understand this point and hope you can give me some advice.
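To make the identity referenced in this question concrete, here is a small numerical check (batch size and number of negatives are illustrative; column 0 holds the positive pair, as in the snippet at the top):

import torch
import torch.nn.functional as F

logits = torch.randn(4, 5)  # column 0: positive pair, columns 1-4: negatives
labels = torch.zeros(4, dtype=torch.long)

# nn.CrossEntropyLoss is exactly log_softmax followed by nll_loss.
loss_a = F.cross_entropy(logits, labels)
loss_b = F.nll_loss(F.log_softmax(logits, dim=1), labels)
assert torch.allclose(loss_a, loss_b)

# Minimizing this loss maximizes softmax(logits)[:, 0], i.e. it pushes the
# positive-pair logit up relative to the negative logits.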
mingchenLuo commented
Is it because we want to maximize the mutual information between the positive sample pairs, so we need to maximize the density ratio, and the dot product in the numerator is proportional to the density ratio, so we need to maximize that numerator dot product?
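For reference, this is the argument from van den Oord et al. 2018 ("Representation Learning with Contrastive Predictive Coding"): the score f is trained to be proportional to a density ratio, and the InfoNCE loss over N samples lower-bounds the mutual information:

f_k(x_{t+k}, c_t) \propto \frac{p(x_{t+k} \mid c_t)}{p(x_{t+k})},
\qquad
I(x_{t+k}; c_t) \ge \log N - \mathcal{L}_N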
RElbers commented
Sorry, what I said in my previous comment was wrong. We want high values (similarity) for positive pairs and low values for negative pairs. And to optimize this, we can simply use the categorical cross entropy.
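Tying this back to the earlier question about other measures: any score that is high for positive pairs and low for negative pairs can serve as the logits. A minimal sketch using negative squared Euclidean distance in place of cosine similarity (neg_sq_euclidean_logits is a hypothetical helper for illustration, not part of this library):

import torch
import torch.nn.functional as F

def neg_sq_euclidean_logits(query, positive_key, negative_keys):
    # Hypothetical helper: higher score = more similar, as cross entropy
    # over logits requires.
    positive_logit = -((query - positive_key) ** 2).sum(dim=1, keepdim=True)
    negative_logits = -torch.cdist(query, negative_keys) ** 2
    return torch.cat([positive_logit, negative_logits], dim=1)

query = torch.randn(4, 128)
positive_key = torch.randn(4, 128)
negative_keys = torch.randn(16, 128)

logits = neg_sq_euclidean_logits(query, positive_key, negative_keys)
labels = torch.zeros(len(logits), dtype=torch.long)  # positives at index 0
loss = F.cross_entropy(logits, labels)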
mingchenLuo commented
Thank you very much, I understand now. In this case, the normalized inner product of the vectors represents cosine similarity, and the larger the inner product, the higher the similarity, which makes logical sense.