bug in computing evaluation results across queries
laura-dietz opened this issue · 1 comment
laura-dietz commented
It seems that when computing an eval measure across queries, you only average the results over the rankings that were submitted. But you are not penalizing the case where a query is not answered with a ranking at all.
Example:
The task was to rank elements for 100 queries
The system only retrieves a (non-empty) result for 20 of them.
That's not a good system, is it?
I simulated this case by dropping a bunch of queries from the test200 mock case (GitHub made me rename *run to *txt).
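To illustrate the difference between the two averaging behaviours, here is a minimal sketch (function and variable names are hypothetical, not from this repo), assuming the measure is macro-averaged over queries:

```python
def macro_average(per_query_scores, all_query_ids):
    """Compare the buggy and the intended macro-average of an eval measure.

    per_query_scores: dict mapping query id -> score, only for queries
                      the system actually answered with a ranking.
    all_query_ids:    every query id in the task.
    """
    # Buggy behaviour: average only over the queries that have a ranking.
    answered = list(per_query_scores.values())
    buggy = sum(answered) / len(answered)
    # Intended behaviour: an unanswered query contributes a score of 0.
    fixed = sum(per_query_scores.get(q, 0.0) for q in all_query_ids) / len(all_query_ids)
    return buggy, fixed

# The scenario above: 100 queries, the system answers only 20 of them,
# each answered query scoring 0.5 on the measure.
scores = {f"q{i}": 0.5 for i in range(20)}
queries = [f"q{i}" for i in range(100)]
buggy, fixed = macro_average(scores, queries)
print(buggy, fixed)  # 0.5 vs 0.1
```

The buggy average makes the system look as good as one that answered all 100 queries at 0.5, while the corrected average reflects the 80 missing rankings.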
tuckerowens commented
I didn't even consider a system that bad. Thank you.