Metrics for multilabel problems don't match the expected format.
adamamer20 opened this issue · 2 comments
Issue
Evaluation metrics cannot be used for multilabel classification problems.
Reproducible example
You can find a reproducible snippet here
Problem explanation
The error is given by how the expected format of some metrics has been chosen.
For example, accuracy and f1 (with average="micro", "macro", etc.) expect scalar predictions and references (Value(dtype='int32', id=None)), so they break down in the multilabel case with ValueError: Predictions and/or references don't match the expected format.
Apart from the hassle of reshaping predictions and labels, and the confusion over which indices correspond to the same label and which to the same instance, this differs from how other libraries handle it. Scikit-learn accepts nested lists for multilabel f1.
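For comparison, a minimal scikit-learn sketch showing that nested lists work directly (the label values are illustrative):

```python
from sklearn.metrics import f1_score

# Multilabel targets: each inner list is one instance,
# each position within it is one label.
y_true = [[0, 1, 1], [1, 0, 0]]
y_pred = [[0, 1, 0], [1, 0, 1]]

# scikit-learn accepts these nested lists as-is for multilabel f1.
score = f1_score(y_true, y_pred, average="macro")
print(score)  # 0.666...
```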
Possible solution
Refactor the expected format of the EvaluationModule for accuracy and f1 (and other affected metrics) to also accept Sequence inputs.
Hi @adamamer20, did you try to use f1_metric = evaluate.load("f1", "multilabel")?
Your question is similar to #550
Thank you, it worked. I tried searching the docs but there isn't anything on multilabel.