[Bug/Model Request]: Question on BM42 performance on Quora dataset

Question

VoVAllen opened this issue 4 months ago · 2 comments

What happened?

https://github.com/castorini/anserini/blob/5eb46b9f9bd563c34deca85a5c7417c068348972/docs/regressions/regressions-beir-v1.0.0-quora.flat.md

According to Anserini's experiment, BM25 reaches an NDCG@10 of 78.8%, which is much better than the Precision@10 of 45% reported in https://qdrant.tech/articles/bm42/. Why is the BM25 performance in Qdrant so much worse than in Anserini using Elasticsearch?
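One caveat on the comparison above: NDCG@10 and Precision@10 are different metrics, so the two percentages may not be directly comparable. The following is a minimal sketch on made-up relevance labels, not data from either benchmark. The function names are hypothetical, and the ideal ranking is computed only from the retrieved list, which assumes every relevant document was retrieved.

```python
import math


def precision_at_k(relevance, k=10):
    """Fraction of the top-k results that are relevant (binary labels)."""
    return sum(relevance[:k]) / k


def ndcg_at_k(relevance, k=10):
    """NDCG@k for binary relevance labels listed in ranked order.

    Assumption: the ideal ranking is built from the retrieved list only,
    i.e. all relevant documents for the query appear in `relevance`.
    """
    dcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevance[:k]))
    ideal = sorted(relevance, reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical query with a single relevant document, ranked first.
ranking = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
p10 = precision_at_k(ranking)
n10 = ndcg_at_k(ranking)
print(f"Precision@10 = {p10:.2f}, NDCG@10 = {n10:.2f}")
```

In this toy case the ranking is perfect, yet Precision@10 cannot exceed 0.10 because only one relevant document exists, while NDCG@10 is 1.00. Whether this accounts for any of the gap between the two reports depends on how each evaluation was set up.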

What Python version are you on? e.g. python --version

NA

Version

0.2.7 (Latest)

What os are you seeing the problem on?

Linux

Relevant stack traces and/or logs

No response

Answer 1 · 2024-07-02T11:59:21.000Z

https://discord.com/channels/907569970500743200/1257601658523877449

Answer 2 · 2024-07-06T13:28:22.000Z

https://github.com/qdrant/bm42_eval