Bug - labels are only taken from true data
yoeldk opened this issue · 3 comments
yoeldk commented
Example:

```python
from seqeval.metrics import classification_report

y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O']]
y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'B-TEST']]
print(classification_report(y_true, y_pred))
```
The result is:
```
           precision    recall  f1-score   support

      PER       1.00      1.00      1.00         1
     MISC       0.00      0.00      0.00         1

micro avg       0.33      0.50      0.40         2
macro avg       0.50      0.50      0.50         2
```
As you can see, TEST does not appear in the table even though B-TEST is a false positive (it appears only in the predicted labels). The complete label list should be the union of the labels in y_true and y_pred.
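For illustration, here is a minimal sketch of what taking that union would look like; `label_union` is a hypothetical helper, not part of seqeval's API:

```python
from itertools import chain

def label_union(y_true, y_pred):
    """Entity types that occur in either the gold or the predicted tags (hypothetical helper)."""
    tags = chain.from_iterable(y_true + y_pred)
    # Strip the BIO prefix ('B-'/'I-') and skip the outside tag 'O'.
    return sorted({tag.split('-', 1)[1] for tag in tags if tag != 'O'})

y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O']]
y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'B-TEST']]
print(label_union(y_true, y_pred))  # ['MISC', 'PER', 'TEST']
```

Note that the micro-averaged numbers above already count the extra prediction: of the three predicted spans (MISC, PER, TEST) only PER is correct, so precision is 1/3 ≈ 0.33, while recall over the two gold spans is 1/2 = 0.50. Only the per-label rows omit TEST.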
yoeldk commented
@icoxfog417 isn't it a bug?
yoeldk commented
@icoxfog417 Can you explain why it is labeled as a question and not a bug?
Hironsan commented
@yoeldk As of v1.0.0:
```python
>>> from seqeval.metrics import classification_report
>>> from seqeval.scheme import IOB2
>>> y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O']]
>>> y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'B-TEST']]
>>> print(classification_report(y_true, y_pred, mode='strict', scheme=IOB2))
              precision    recall  f1-score   support

        MISC       0.00      0.00      0.00         1
         PER       1.00      1.00      1.00         1
        TEST       0.00      0.00      0.00         0

   micro avg       0.33      0.50      0.40         2
   macro avg       0.33      0.33      0.33         2
weighted avg       0.50      0.50      0.50         2
```
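In strict mode the label set is built from both sequences, so an entity type seen only in y_pred (TEST here) appears as a zero-support false positive. The same check through the scalar metrics, sketched under the assumption that they accept the same mode and scheme keywords as classification_report:

```python
>>> from seqeval.metrics import f1_score, precision_score, recall_score
>>> precision_score(y_true, y_pred, mode='strict', scheme=IOB2)  # 1 correct span of 3 predicted
0.3333333333333333
>>> recall_score(y_true, y_pred, mode='strict', scheme=IOB2)     # 1 correct span of 2 gold
0.5
>>> f1_score(y_true, y_pred, mode='strict', scheme=IOB2)
0.4
```

These values match the micro avg row of the report above.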