Macro F1 and Precision
furkan-celik opened this issue · 2 comments
Hi,
I could not find any mention of micro vs. macro metrics, but would it be possible for you to implement macro F1 too? I tried to write it on my own but could not figure out the code flow and dependencies between functions. As far as I understand from looking at the get_metric function, you are calculating micro F1. That is fine, but I believe macro F1 gives a more accurate picture, since token-based classification tasks can be tremendously imbalanced. In addition, if you could also exclude the "O" tag, or at least provide a parameter to toggle it, that would be great, since the "O" tag is predominant in many token classification tasks.
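For example, here is a toy illustration (the labels and counts are made up) of how micro and macro F1 can diverge under class imbalance, using scikit-learn's f1_score:

```python
from sklearn.metrics import f1_score

# Hypothetical, imbalanced token labels: "O" dominates, "PER" is rare.
y_true = ["O"] * 90 + ["PER"] * 10
# The model predicts "O" almost everywhere, catching only 2 of 10 "PER" tokens.
y_pred = ["O"] * 90 + ["O"] * 8 + ["PER"] * 2

# Micro F1 is high (~0.92) because the dominant "O" tag drives the count.
print(f1_score(y_true, y_pred, average="micro"))
# Macro F1 is much lower (~0.65) because each class is weighted equally,
# exposing the poor performance on the rare "PER" tag.
print(f1_score(y_true, y_pred, average="macro"))
```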
Thank you for considering this issue.
- From what I understand, you can obtain the macro F1 using the values from here: https://github.com/allanj/pytorch_lstmcrf/blob/master/transformers_trainer.py#L182-L185
```python
# Per-entity-type precision/recall/F1 reporting
if print_each_type_metric:
    for key in total_entity_dict:
        precision_key, recall_key, fscore_key = get_metric(p_dict[key], total_entity_dict[key], total_predict_dict[key])
        print(f"[{key}] Prec.: {precision_key:.2f}, Rec.: {recall_key:.2f}, F1: {fscore_key:.2f}")
```
You can store the precision, recall, and F1 for each entity type and average over these scores; see the sketch after this list.
- The evaluation results actually already exclude the "O" tag: the evaluation function (https://github.com/allanj/pytorch_lstmcrf/blob/dbc70a28d5945b6f41e7dce3ab5bf91a3ac037ad/src/config/eval.py#L30) does not look at the O tag.
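Here is a minimal sketch of that averaging, assuming the `p_dict`, `total_entity_dict`, `total_predict_dict`, and `get_metric` names from transformers_trainer.py (not an implementation shipped with the repo):

```python
# Collect per-entity-type scores, then average them for macro metrics.
precisions, recalls, f1s = [], [], []
for key in total_entity_dict:
    precision_key, recall_key, fscore_key = get_metric(
        p_dict[key], total_entity_dict[key], total_predict_dict[key]
    )
    precisions.append(precision_key)
    recalls.append(recall_key)
    f1s.append(fscore_key)

# Macro scores: unweighted mean over entity types (the "O" tag is already
# excluded, since it never appears in the per-type dictionaries).
macro_precision = sum(precisions) / len(precisions)
macro_recall = sum(recalls) / len(recalls)
macro_f1 = sum(f1s) / len(f1s)
print(f"Macro Prec.: {macro_precision:.2f}, Rec.: {macro_recall:.2f}, F1: {macro_f1:.2f}")
```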
Hi,
Thank you for your response, I hadn't seen those.