State-of-the-art pre-retrieval and post-retrieval QPP methods on TREC datasets
Implementations are provided for the following pre-retrieval QPP methods: avgIDF, maxIDF, SCS, avgICTF, avgSCQ, maxSCQ and sumSCQ. Before running them, set the index path (indexpath), the number of documents in the collection and the total number of terms in the collection in pre-retrievals.py.
You can find the results of state-of-the-art query performance prediction (QPP) methods for predicting the performance of the Query-Likelihood (QL) retrieval model on well-known TREC datasets such as Robust04, GOV2, ClueWeb09 and ClueWeb12, together with their associated topics.
Aggregation functions (avg, min, max, sum) are applied over the per-term scores of the query terms to obtain a single QPP value for the whole query.
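As a rough illustration of this aggregation step, the sketch below computes a per-term IDF score and then aggregates it over the query. It is not the code in pre-retrievals.py: the names `idf`, `aggregate_idf`, `doc_freqs` and `num_docs` are hypothetical, the document frequencies are made-up toy values, and the formula log(N / df) is only one common IDF variant, which may differ from the one used in the repository.

```python
import math


def idf(doc_freq, num_docs):
    """Inverse document frequency of one term, using log(N / df).

    Returns 0.0 for terms that do not occur in the collection
    (an assumption made here to avoid division by zero).
    """
    if doc_freq <= 0:
        return 0.0
    return math.log(num_docs / doc_freq)


def aggregate_idf(query_terms, doc_freqs, num_docs):
    """Aggregate per-term IDF scores into query-level predictors."""
    scores = [idf(doc_freqs.get(term, 0), num_docs) for term in query_terms]
    if not scores:
        # Empty query: every aggregate is defined as 0.0 in this sketch.
        return {"avg": 0.0, "min": 0.0, "max": 0.0, "sum": 0.0}
    return {
        "avg": sum(scores) / len(scores),
        "min": min(scores),
        "max": max(scores),
        "sum": sum(scores),
    }


# Toy statistics (hypothetical, not taken from any TREC collection).
doc_freqs = {"query": 10, "performance": 100}
result = aggregate_idf(["query", "performance"], doc_freqs, num_docs=1000)
print(result)
```

In this sketch, `result["avg"]` and `result["max"]` play the role of avgIDF and maxIDF. Other per-term statistics can be aggregated the same way by swapping out the `idf` function.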
Details of each of the methods can be found in the following references:
Pre-retrieval methods:
- SCS : Using Coherence-Based Measures to Predict Query Difficulty
- SCQ : Effective Pre-retrieval Query Performance Prediction Using Similarity and Variability Evidence
- IDF : Using Coherence-Based Measures to Predict Query Difficulty
- PMI : Predicting the effectiveness of queries and retrieval systems
- VAR : Effective Pre-retrieval Query Performance Prediction Using Similarity and Variability Evidence
Post-retrieval methods:
- WIG : Query performance prediction in web search environments
- Clarity : Predicting query performance
- QF : Query performance prediction in web search environments
- ISD : Improved query performance prediction using standard deviation
- SD : Standard Deviation as a Query Hardness Estimator
- SMV : Query Performance Prediction By Considering Score Magnitude and Variance Together
- UEF : Using statistical decision theory and relevance models for query-performance prediction
Please do not hesitate to contact me if you have any questions: narabzad@ryerson.ca