luyaojie/Text2Event

A bug in computing F1

h-peng17 opened this issue · 2 comments

Thanks for the code. I think there may be a bug in how F1 is computed. In your code, the predicted list (taking argument extraction as an example) is [(type1, role1, argument1), ...]. However, the tuples do not include an instance_id, and different instances may share the same (type1, role1, argument1), which would produce extra true positives. This would make the final evaluation metrics higher than they should be. Or maybe I misunderstand your code. I look forward to your reply.

Hi, thanks for your attention.

The evaluation code counts true predictions instance by instance, so there is no need to consider instance_id.
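
To illustrate the point, here is a minimal sketch of instance-by-instance counting. It is not the repository's actual code: the names `count_instance` and `micro_f1` and the toy data are assumptions made for this example. Because each prediction is matched only against the gold list of its own instance, an identical (type, role, argument) tuple in a different instance cannot be counted as a true prediction.

```python
from typing import List, Tuple

# Hypothetical record shape for this sketch: (event type, role, argument)
Triple = Tuple[str, str, str]


def count_instance(gold: List[Triple], pred: List[Triple]) -> Tuple[int, int, int]:
    """Return (true positives, gold count, predicted count) for ONE instance."""
    remaining = list(gold)  # copy so the caller's list is not modified
    tp = 0
    for item in pred:
        if item in remaining:
            tp += 1
            remaining.remove(item)  # a gold item can be matched at most once
    return tp, len(gold), len(pred)


def micro_f1(gold_instances: List[List[Triple]],
             pred_instances: List[List[Triple]]) -> Tuple[float, float, float]:
    """Sum per-instance counts, then compute precision, recall and F1."""
    tp = gold_num = pred_num = 0
    for gold, pred in zip(gold_instances, pred_instances):
        t, g, p = count_instance(gold, pred)
        tp += t
        gold_num += g
        pred_num += p
    precision = tp / pred_num if pred_num else 0.0
    recall = tp / gold_num if gold_num else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data: the same tuple is gold in instance 0 but predicted in instance 1.
gold_instances = [[("Attack", "Target", "city")], []]
pred_instances = [[], [("Attack", "Target", "city")]]

# Per-instance counting: the prediction does not match anything.
per_instance_scores = micro_f1(gold_instances, pred_instances)

# Pooling all instances into one list (the scenario raised in the question)
# would wrongly count the tuple as a true positive.
pooled_gold = [t for inst in gold_instances for t in inst]
pooled_pred = [t for inst in pred_instances for t in inst]
pooled_tp, _, _ = count_instance(pooled_gold, pooled_pred)
```

In this toy case the per-instance scores are all zero, while the pooled count finds one true positive. That inflation only arises if tuples from different instances are merged before matching.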

I got it. Thank you!