UKPLab/gpl

Evaluation data format

Matthieu-Tinycoaching opened this issue · 1 comment

Hi,

1/ What format should the data passed in the evaluation_data argument have? Could you provide an example of evaluation data and show how it should be formatted?

2/ How does the evaluation work on this data? Which tests are run, and which labels are used?

Thanks!

I haven't figured out how it works yet. I tried passing a folder of evaluation data containing
corpus.jsonl, queries.jsonl and qrels/train.tsv. That doesn't work: nothing happens.
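
For reference, here is a minimal sketch of the BEIR-style folder layout those three files follow, using only the standard library. The helper name write_beir_folder and the toy documents are made up for illustration. BEIR's GenericDataLoader reads the qrels file named after the requested split and defaults to the test split, so a folder holding only qrels/train.tsv may simply not match what the evaluation asks for. Whether that is the cause here is an assumption I have not verified against GPL itself.

```python
import csv
import json
import tempfile
from pathlib import Path

# Toy data, invented for illustration only.
TOY_CORPUS = {
    "d1": {"title": "GPL", "text": "Generative pseudo labeling adapts dense retrievers."},
    "d2": {"title": "BM25", "text": "BM25 is a lexical ranking function."},
}
TOY_QUERIES = {"q1": "how to adapt a dense retriever"}
TOY_QRELS = {"q1": {"d1": 1}}  # query id -> {doc id -> relevance label}


def write_beir_folder(root, corpus, queries, qrels, split="test"):
    """Write corpus.jsonl, queries.jsonl and qrels/<split>.tsv in BEIR style."""
    root = Path(root)
    (root / "qrels").mkdir(parents=True, exist_ok=True)

    # One JSON object per line with the keys _id, title, text.
    with open(root / "corpus.jsonl", "w", encoding="utf-8") as f:
        for doc_id, doc in corpus.items():
            row = {"_id": doc_id, "title": doc.get("title", ""), "text": doc["text"]}
            f.write(json.dumps(row) + "\n")

    # One JSON object per line with the keys _id, text.
    with open(root / "queries.jsonl", "w", encoding="utf-8") as f:
        for query_id, text in queries.items():
            f.write(json.dumps({"_id": query_id, "text": text}) + "\n")

    # Tab-separated relevance judgements with a header row.
    with open(root / "qrels" / f"{split}.tsv", "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(["query-id", "corpus-id", "score"])
        for query_id, judged in qrels.items():
            for doc_id, score in judged.items():
                writer.writerow([query_id, doc_id, score])
    return root


if __name__ == "__main__":
    out = write_beir_folder(tempfile.mkdtemp(), TOY_CORPUS, TOY_QUERIES, TOY_QRELS)
    print(sorted(str(p.relative_to(out)) for p in out.rglob("*") if p.is_file()))
```

The qrels file is where the labels live: each row says which document is relevant to which query, and the metrics are computed against those rows.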

It would be nice to have some training metrics that show what's happening: plotting the loss, maybe, or evaluating on the data every hundred steps. It seems that my metrics keep improving well after 100k steps (the BEIR metrics are NDCG, MAP, Precision and Recall @ K).
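
To make the "labels" part concrete, here is a rough plain-Python sketch of how Precision, Recall and NDCG @ K relate a ranked list to the qrels for a single query. BEIR itself computes these through pytrec_eval; the functions below are only an illustration with made-up names and toy inputs, not BEIR's or GPL's code.

```python
import math


def precision_at_k(ranked_ids, relevance, k):
    """Fraction of the top-k retrieved documents that are labelled relevant."""
    hits = sum(1 for doc_id in ranked_ids[:k] if relevance.get(doc_id, 0) > 0)
    return hits / k


def recall_at_k(ranked_ids, relevance, k):
    """Fraction of all relevant documents that appear in the top k."""
    total_relevant = sum(1 for score in relevance.values() if score > 0)
    if total_relevant == 0:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if relevance.get(doc_id, 0) > 0)
    return hits / total_relevant


def ndcg_at_k(ranked_ids, relevance, k):
    """DCG of the ranking divided by the DCG of an ideal ranking (linear gain)."""
    dcg = sum(
        relevance.get(doc_id, 0) / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
    )
    ideal_scores = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(score / math.log2(rank + 2) for rank, score in enumerate(ideal_scores))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    ranked = ["d2", "d1", "d3"]   # hypothetical system output, best first
    labels = {"d1": 1}            # hypothetical qrels for one query
    for k in (1, 2):
        print(k, precision_at_k(ranked, labels, k),
              recall_at_k(ranked, labels, k),
              ndcg_at_k(ranked, labels, k))
```

Running something like this on a held-out split every so many steps would give the kind of training curve asked for above.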