chakki-works/seqeval

Bug - labels are only taken from true data

yoeldk opened this issue · 3 comments

Example:
from seqeval.metrics import classification_report
y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O']]
y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'B-TEST']]

print(classification_report(y_true, y_pred))

The result is:
           precision    recall  f1-score   support

      PER       1.00      1.00      1.00         1
     MISC       0.00      0.00      0.00         1

micro avg       0.33      0.50      0.40         2
macro avg       0.50      0.50      0.50         2

As you can see, B-TEST does not appear in the table even though it is a false positive (it appears only in the predicted labels). The complete label list should be the union of the true and predicted labels.
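
For illustration, here is a minimal sketch (plain Python, not a seqeval API; the helper name entity_types is made up) of what the complete label set would look like when built from both sides, using y_true and y_pred from the example above:

def entity_types(sequences):
    # Strip the B-/I- prefix and collect the entity types that actually occur.
    return {tag.split('-', 1)[1] for seq in sequences for tag in seq if tag != 'O'}

print(sorted(entity_types(y_true) | entity_types(y_pred)))
# ['MISC', 'PER', 'TEST']  <- TEST should get a row (with support 0)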

@icoxfog417 Isn't this a bug?

@icoxfog417 Can you explain why it is labeled as a question and not a bug?

@yoeldk As of v1.0.0:

>>> from seqeval.metrics import classification_report
>>> from seqeval.scheme import IOB2
>>> y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O']]
>>> y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'B-TEST']]
>>> print(classification_report(y_true, y_pred, mode='strict', scheme=IOB2))
              precision    recall  f1-score   support

        MISC       0.00      0.00      0.00         1
         PER       1.00      1.00      1.00         1
        TEST       0.00      0.00      0.00         0

   micro avg       0.33      0.50      0.40         2
   macro avg       0.33      0.33      0.33         2
weighted avg       0.50      0.50      0.50         2
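
With mode='strict' and an explicit scheme, the label set covers the union of the true and predicted sequences, so an entity type that appears only in the predictions, such as TEST, gets its own row with support 0.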