lancopku/SGM

Does the order of predicted labels matter? And what if the length of the prediction is not the same as the ground truth?

eveliao opened this issue · 1 comment

I checked the source code of hamming loss in sklearn.
The two inputs y_pred and y_true must have the same shape. So if their lengths are not equal, should we just pad them with a special token?
Besides, the order of the predicted sequence also matters. For example, the hamming loss of [1,2,3,4] and [2,3,4,1] is 1, because the two sequences differ at every position. So should we sort y_pred and y_true before calculating the loss?

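If there are a total of $N$ labels, you can convert the predicted label sequence and the ground-truth label sequence into $N$-dimensional sparse vectors, with a 1 at the index of each label that is present and 0 elsewhere, and then calculate the corresponding hamming loss and $F_1$ score on those vectors. Both vectors always have length $N$, so no padding is needed when the sequences differ in length, and the calculation of the evaluation metric is independent of the label order.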
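
To make the conversion concrete, here is a minimal sketch in plain Python. It is not the repository's evaluation code: the helper names `to_multi_hot`, `hamming_loss`, and `micro_f1` are hypothetical, labels are assumed to be integer indices in `0..N-1`, and the micro-averaged $F_1$ is only one possible averaging choice.

```python
from typing import Iterable, List, Sequence


def to_multi_hot(labels: Iterable[int], num_labels: int) -> List[int]:
    """Map a label sequence to an N-dimensional 0/1 vector (assumes labels in 0..N-1)."""
    vec = [0] * num_labels
    for label in labels:
        vec[label] = 1  # position in the sequence and duplicates are discarded
    return vec


def hamming_loss(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of (sample, label) entries where the two 0/1 matrices disagree."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        int(t != p)
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1 over all (sample, label) entries."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


N = 5  # assumed total number of labels for this example

# Same label set in a different order: the vectors are identical.
truth = [to_multi_hot([1, 2, 3, 4], N)]
pred = [to_multi_hot([2, 3, 4, 1], N)]
print(hamming_loss(truth, pred), micro_f1(truth, pred))

# Different lengths: no padding is needed, both vectors have length N.
truth = [to_multi_hot([1, 2, 3], N)]
pred = [to_multi_hot([3, 1], N)]
print(hamming_loss(truth, pred), micro_f1(truth, pred))
```

In the first case the loss is 0 and $F_1$ is 1, since only the set of labels is compared. In the second case the prediction misses one of the three true labels, so one of the five entries differs.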