allanj/pytorch_neural_crf

Macro F1 and Precision

furkan-celik opened this issue · 2 comments

Hi,

I could not find any mention of micro vs. macro metrics, but would it be possible for you to implement macro F1 as well? I tried to write it on my own but could not figure out the code flow and the dependencies between functions. However, as far as I understand from the get_metric function, you are calculating micro F1. That is fine, but I believe macro F1 gives a more accurate picture, since token-level classification tasks can be tremendously imbalanced; a toy illustration is sketched below. In addition, if you could also exclude the "O" tag, or at least provide a parameter to toggle it, that would be great, since the "O" tag is predominant in many token classification tasks.
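For example, with purely hypothetical counts the two metrics can diverge sharply when one type dominates:

# Toy counts (hypothetical) showing how micro and macro F1 diverge when one
# class dominates: micro pools TP/FP/FN across types, macro averages the
# per-type F1 scores.
def f1(tp, fp, fn):
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

counts = {"PER": (900, 50, 50), "MISC": (5, 20, 45)}  # type -> (tp, fp, fn)

micro = f1(*(sum(c[i] for c in counts.values()) for i in range(3)))
macro = sum(f1(*c) for c in counts.values()) / len(counts)
print(f"micro F1: {micro:.2f}")  # ~0.92, dominated by the frequent PER type
print(f"macro F1: {macro:.2f}")  # ~0.54, exposes the weak MISC type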

Thank you for considering this issue

  1. From what I understand, you can obtain the macro F1 using the values from here: https://github.com/allanj/pytorch_lstmcrf/blob/master/transformers_trainer.py#L182-L185
if print_each_type_metric:
    for key in total_entity_dict:
        precision_key, recall_key, fscore_key = get_metric(p_dict[key], total_entity_dict[key], total_predict_dict[key])
        print(f"[{key}] Prec.: {precision_key:.2f}, Rec.: {recall_key:.2f}, F1: {fscore_key:.2f}")

You can store the precision, recall, and F1 for each entity type and average over these scores to get the macro values (see the first sketch after this list).

  2. The evaluation results actually already exclude the "O" tag. The evaluation function (https://github.com/allanj/pytorch_lstmcrf/blob/dbc70a28d5945b6f41e7dce3ab5bf91a3ac037ad/src/config/eval.py#L30) never looks at the O tag, because it scores predicted entity spans against gold spans (see the second sketch below).
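Regarding point 1, a minimal sketch of that averaging (building on the loop quoted above; get_metric and the dicts come from transformers_trainer.py, while per_type_scores and the macro_* names are illustrative):

# Sketch: reuse the per-type loop above, keep the scores, then take the
# unweighted mean over types (the usual definition of macro averaging).
# get_metric, p_dict, total_entity_dict, total_predict_dict are the objects
# from transformers_trainer.py; per_type_scores and macro_* are made-up names.
per_type_scores = {}  # entity type -> (precision, recall, f1)
for key in total_entity_dict:
    per_type_scores[key] = get_metric(p_dict[key], total_entity_dict[key], total_predict_dict[key])

macro_prec = sum(p for p, _, _ in per_type_scores.values()) / len(per_type_scores)
macro_rec = sum(r for _, r, _ in per_type_scores.values()) / len(per_type_scores)
macro_f1 = sum(f for _, _, f in per_type_scores.values()) / len(per_type_scores)
print(f"[Macro] Prec.: {macro_prec:.2f}, Rec.: {macro_rec:.2f}, F1: {macro_f1:.2f}")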
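Regarding point 2, here is a minimal BIO span-extraction sketch (illustrative, not the repo's exact code) showing why the O tag never enters the scores: span-level precision/recall/F1 compare sets of entity spans, and O tokens simply never form a span.

# Minimal BIO span-extraction sketch (illustrative, not the repo's exact code).
# Span-level metrics compare these sets of (start, end, type) spans, so "O"
# tokens, which never open or extend a span, are never counted at all.
def extract_spans(tags):
    spans, start, label = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # "O" sentinel closes a trailing span
        new_label = None if tag == "O" else tag[2:]
        # close the open span on O, on a new B- tag, or on a type change
        if label is not None and (tag.startswith("B-") or new_label != label):
            spans.add((start, i - 1, label))
            start, label = None, None
        # open a new span on B-, or on an I- with no span currently open
        if new_label is not None and label is None:
            start, label = i, new_label
    return spans

gold = extract_spans(["B-PER", "I-PER", "O", "B-LOC"])
pred = extract_spans(["B-PER", "I-PER", "O", "O"])
print(gold & pred)  # true positives: {(0, 1, 'PER')}; no O positions involved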

Hi,

Thank you for your response, I hadn't seen those.