Question about Wi

Question

WilyZhao8 opened this issue 3 years ago · 2 comments

Dear authors,
Thank you for your excellent work. I read the article today.
I have a question about Wi as shown in Equation (5): why does Wi = 1 mean that negative gradients are generated?
I am looking forward to your reply.

Answer 1 · 2021-04-06T08:02:57.000Z

For a sample x belonging to category k, it will activate the classifier of category k, forcing the network to output a high probability for k, while suppressing the classifiers of the other categories i (i != k) so that they output low probabilities. So when we set Wi = 1 (i != k), we expect to keep the negative suppression gradients for categories i (i != k). That's why we need to check whether i equals k in Equation (6). If i = k, Wi = 1 means activating classifier k.
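The effect of Wi can be illustrated with a small sketch. It assumes the loss is a per-class weighted sigmoid cross-entropy, which is an assumption about the form of Equation (5) and not something quoted from the paper; the function name `weighted_bce_logit_grads` and the example logits are hypothetical. Under that assumption, the gradient of the loss with respect to logit z_i is Wi * (p_i - y_i).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def weighted_bce_logit_grads(logits, k, weights):
    """Gradient of the assumed loss w.r.t. each logit z_i.

    Assumed loss: -sum_i W_i * [y_i*log(p_i) + (1 - y_i)*log(1 - p_i)],
    with p_i = sigmoid(z_i) and y one-hot at the ground-truth category k.
    The derivative w.r.t. z_i is W_i * (p_i - y_i).
    """
    grads = []
    for i, (z, w) in enumerate(zip(logits, weights)):
        y = 1.0 if i == k else 0.0
        grads.append(w * (sigmoid(z) - y))
    return grads


logits = [0.5, -1.0, 2.0]  # hypothetical classifier outputs for 3 categories
k = 0                      # ground-truth category of sample x

# W_i = 1 everywhere: classifier k is activated, all others are suppressed.
keep_all = weighted_bce_logit_grads(logits, k, [1.0, 1.0, 1.0])

# W_2 = 0: the suppression gradient for category 2 is dropped.
drop_2 = weighted_bce_logit_grads(logits, k, [1.0, 1.0, 0.0])

print(keep_all)
print(drop_2)
```

In this sketch, the entry for i = k is below zero, so a gradient descent step raises z_k (activation). The entries for i != k with Wi = 1 are above zero, so a descent step lowers those logits; this is the suppression that the answer calls the negative gradient, since x acts as a negative sample for those categories. Setting Wi = 0 for some i != k zeroes that entry and leaves the others unchanged.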

Answer 2 · 2021-04-06T08:21:16.000Z

Thank you, I understand what it means now.