Help with another loss function
josianerodrigues opened this issue · 5 comments
Hi @jiangqy,
I am trying to test your model with another loss function on the NUS-WIDE dataset, BCEWithLogitsLoss, but I'm having a problem with the sizes of the input and the labels. It throws this error:
ValueError: Target size (torch.Size([64, 21])) must be the same as input size (torch.Size([64, 12]))
I do not understand why the input and the labels should be the same size. Can you help me?
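As a minimal sketch of what the error means (assuming, as in the traceback above, a 12-bit hash output and 21 NUS-WIDE labels; the tensor shapes here are illustrative), BCEWithLogitsLoss computes the loss element-wise, so the target must have exactly the same shape as the input:

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

# Shapes matching the error above: 12-bit codes vs. 21 multi-hot labels
logits = torch.randn(64, 12)                     # network output: batch_size x code_length
labels = torch.randint(0, 2, (64, 21)).float()   # targets: batch_size x #labels

try:
    criterion(logits, labels)
except ValueError as e:
    # Raised because the loss is element-wise: every logit needs its own target
    print(e)
```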
@josianerodrigues I don't understand your problem. Do you want to use BCEWithLogitsLoss for pairwise similarity (S_{ij}) or for label similarity (L_i)?
If you mean pairwise similarity, the size of the supervised information should be batch_size x batch_size, i.e., 64 x 64. If you mean label similarity, the size of the supervised information should be batch_size x #labels, i.e., 64 x 21. However, I think you should use MultiLabelSoftMarginLoss rather than BCEWithLogitsLoss.
The size of the output depends on your task.
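To illustrate the label-similarity case described above (a sketch; the batch size and label count follow the 64 x 21 shapes from this thread), MultiLabelSoftMarginLoss also expects the output and the multi-hot targets to share the batch_size x #labels shape:

```python
import torch
import torch.nn as nn

batch_size, num_labels = 64, 21  # NUS-WIDE uses 21 labels in this setup
criterion = nn.MultiLabelSoftMarginLoss()

logits = torch.randn(batch_size, num_labels)              # batch_size x #labels
targets = torch.randint(0, 2, (batch_size, num_labels)).float()

loss = criterion(logits, targets)
print(loss.item())  # a non-negative scalar
```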
Sorry, I confused the functions. But I tested the MultiLabelSoftMarginLoss function and got the same kind of error.
ValueError: Target and input must have the same number of elements. target nelement (735) != input nelement (420)
I think I have to change the output of the network, right?
I tested it on your own model; I just need to check the behavior of other loss functions.
@josianerodrigues I think it depends on your goal.
If you want to use the binary code as features for a multi-label classification task, you should add a layer to project the binary code into label space.
Or, if you just want to use the binary code to approximate the class labels directly, you should change the binary code length to 21 for NUS-WIDE (and set u = 2b - 1 to constrain the code to {0, 1}). But this strategy seems unreasonable.
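The first suggestion above (projecting the binary code into label space) can be sketched as follows. This is a hypothetical illustration, not the repository's actual model: the module name, the 4096-dim input, and the tanh relaxation are assumptions; only the 12-bit code length and 21 labels come from this thread.

```python
import torch
import torch.nn as nn

code_length, num_labels = 12, 21

class HashWithClassifier(nn.Module):
    """Hypothetical hash network with an extra classification head."""
    def __init__(self, in_features=4096):
        super().__init__()
        self.hash_layer = nn.Linear(in_features, code_length)  # binary-code layer
        self.classifier = nn.Linear(code_length, num_labels)   # projection to label space

    def forward(self, x):
        code = torch.tanh(self.hash_layer(x))   # relaxed binary code in (-1, 1)
        return code, self.classifier(code)      # logits now match the 21-label targets

model = HashWithClassifier()
x = torch.randn(64, 4096)
code, logits = model(x)
print(code.shape, logits.shape)  # torch.Size([64, 12]) torch.Size([64, 21])
```

The classifier output can then be fed to MultiLabelSoftMarginLoss without any shape mismatch.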
Thank you, @jiangqy.
Hi @jiangqy,
Could you please make your model implementation available to the MS-COCO dataset?