Finish integrating ERROR category into scores
ctwardy opened this issue · 1 comment
ctwardy commented
The scikit scores seem to ignore ERROR (i.e. 'error') in the results. Notice the zeros in the 'error' row below, while the confusion matrix shows some action (see the verification sketch after the output):
```
              precision    recall  f1-score   support

   UNCERTAIN       0.00      0.00      0.00         0
        blog       0.82      0.54      0.65        69
  classified       0.44      0.28      0.34        75
       error       0.00      0.00      0.00       240
       forum       0.77      0.80      0.78       337
        news       0.86      0.44      0.59       151
    shopping       0.52      0.70      0.60       155
        wiki       0.84      0.85      0.84        79

 avg / total       0.56      0.52      0.53      1106
```
```
Confusion Matrix:
 UNCERTAIN:   0,  0,  0, 0,   0,  0,   0,  0
      blog:  15, 37,  4, 0,   3,  1,   9,  0
classified:  23,  0, 21, 0,   0,  0,  31,  0
     error: 133,  7,  1, 0,  68,  8,  16,  7
     forum:  48,  1,  4, 0, 271,  1,  12,  0
      news:  28,  0, 10, 0,  10, 67,  30,  6
  shopping:  37,  0,  8, 0,   0,  1, 109,  0
      wiki:   7,  0,  0, 0,   2,  0,   3, 67

µ Info:     0.39
Total #:    1106
#Errors:    0 ( 0 Bleached)
#Predicted: 1106
Accuracy:   0.52
```
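For what it's worth, an all-zero row in the report is exactly what scikit-learn produces when a label never occurs among the predictions: with zero true positives, recall is 0, precision is 0/0 (reported as 0 with a warning), and f1 follows. A minimal sketch to confirm that for 'error' — assuming `y_true` and `y_pred` are the plain label sequences the scorer hands to scikit-learn; the names here are illustrative, not taken from the codebase:

```python
from collections import Counter

from sklearn.metrics import classification_report, confusion_matrix

# Label order only affects how the report/matrix rows are laid out.
LABELS = ["UNCERTAIN", "blog", "classified", "error",
          "forum", "news", "shopping", "wiki"]

def check_error_predictions(y_true, y_pred):
    """Print the scikit report plus a raw tally of the predicted labels.

    y_true / y_pred are placeholder names for the gold and predicted
    label sequences; substitute whatever the scorer actually passes in.
    """
    print(classification_report(y_true, y_pred, labels=LABELS))
    print(confusion_matrix(y_true, y_pred, labels=LABELS))

    counts = Counter(y_pred)
    print("predicted label counts:", dict(counts))
    if counts.get("error", 0) == 0:
        print("'error' never appears in y_pred, so precision/recall/f1 are 0")
```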
ctwardy commented
Wait, no, fixing #15 didn't fix this. The problem is that it's never predicting 'error':
```
              precision    recall  f1-score   support
       error       0.00      0.00      0.00       240

Confusion Matrix:
     error: 7, 7, 8, 68, 1, 16, 133, 0   <-- note the zero in the last column
```
The 'error' entries are showing up as UNCERTAIN or as 'forum', so 'error' is never rising above the threshold. Investigate.
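If the pipeline only commits to a label when its score clears a confidence threshold and falls back to UNCERTAIN otherwise (which is what the behaviour above suggests), one way to investigate is to look at the raw scores the model assigns to 'error' on the known-error documents. A rough diagnostic sketch, assuming a scikit-style estimator with `predict_proba`; `clf`, `threshold`, and the UNCERTAIN fallback are placeholders for whatever the real pipeline uses:

```python
import numpy as np

def diagnose_error_scores(clf, X, y_true, threshold=0.5):
    """Summarize how close 'error' ever gets to the decision threshold.

    Assumes clf exposes predict_proba/classes_ in scikit-learn fashion;
    the threshold and UNCERTAIN fallback mirror the behaviour described
    in this issue, not the actual implementation.
    """
    proba = clf.predict_proba(X)              # shape (n_docs, n_classes)
    classes = np.asarray(clf.classes_)
    error_col = int(np.where(classes == "error")[0][0])

    # Restrict to documents whose gold label is 'error'.
    mask = np.asarray(y_true) == "error"
    error_scores = proba[mask, error_col]

    print("max  score for 'error' on true-error docs:", error_scores.max())
    print("mean score for 'error' on true-error docs:", error_scores.mean())
    print("true-error docs clearing the threshold   :",
          int((error_scores >= threshold).sum()), "of", int(mask.sum()))

    # What a thresholded argmax decision would pick for those documents.
    best_idx = proba[mask].argmax(axis=1)
    best_scores = proba[mask].max(axis=1)
    picked = np.where(best_scores >= threshold, classes[best_idx], "UNCERTAIN")
    labels, counts = np.unique(picked, return_counts=True)
    print("labels picked instead:", dict(zip(labels.tolist(), counts.tolist())))
```

If the max 'error' score on true-error documents sits well below the threshold, the fix is in the model or the threshold, not in the scoring code.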