usnistgov/trec_eval

Segfault from no topic overlap


If there is no overlap between the topics in the qrels file and the topics in the run, trec_eval segfaults. In TREC we usually run with -c, which avoids this problem by using the complete set of topics in the qrels file.

In this test case, the topic numbers in the qrels file have a .X suffix, but the run does not.

$ ./trec_eval -q -M1000 -m all_trec RAG.20240927.qrels ielab-blender-llama70b-internal-only 
[1]    56884 segmentation fault  ./trec_eval -q -M1000 -m all_trec RAG.20240927.qrels 

$ ./trec_eval -c -q -M1000 -m all_trec RAG.20240927.qrels ielab-blender-llama70b-internal-only
[lots and lots of zeros]
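A mismatch like this can be caught before calling trec_eval by comparing the topic IDs in the two files. The following is a minimal sketch, not part of trec_eval: it assumes the usual whitespace-separated TREC layouts in which the topic ID is the first field of every line in both the qrels file and the run. The helper names (`topic_ids`, `shared_topics`) and the sample lines are made up for illustration.

```python
from typing import Iterable, Set


def topic_ids(lines: Iterable[str]) -> Set[str]:
    """Collect the first whitespace-separated field of each non-blank line."""
    return {line.split()[0] for line in lines if line.strip()}


def shared_topics(qrels_lines: Iterable[str], run_lines: Iterable[str]) -> Set[str]:
    """Return the topic IDs that appear in both the qrels and the run."""
    return topic_ids(qrels_lines) & topic_ids(run_lines)


# Hypothetical sample data: qrels topics carry a ".1" suffix, the run's do not.
qrels_sample = ["101.1 0 docA 1", "102.1 0 docB 0"]
run_sample = ["101 Q0 docA 1 12.5 mytag", "102 Q0 docB 2 11.0 mytag"]

overlap = shared_topics(qrels_sample, run_sample)
if not overlap:
    print("no topics in common; check the topic ID formats before running trec_eval")
```

For real files, open file objects can be passed in place of the sample lists, since both helpers only iterate over lines.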

This is related to the change that added comments to qrels files. That change broke docids that contain '#' characters, such as those in MS MARCO passages. 4a0dc1a reverts the change pending a rethink of that implementation.