BERT-QPP: Contextualized Pre-trained Transformers for Query Performance Prediction

In this paper, we adopt contextual embeddings for the task of query performance prediction (QPP). The fine-tuned contextual representations estimate the performance of a query based on the association between the representation of the query and the retrieved documents. We compare our approach with the state of the art on the MS MARCO passage retrieval corpus and its three associated query sets: (1) MS MARCO development set, (2) TREC DL 2019, and (3) TREC DL 2020. We show that our approach not only achieves significantly improved prediction performance compared to all the state-of-the-art methods but also, unlike past neural predictors, has significantly lower latency, making it usable in practice.

We adopt two architectures, namely a cross-encoder network and a bi-encoder network, to address the QPP task.

To replicate our results with BERT-QPP_cross and BERT-QPP_bi on the MSMARCO passage collection:

Clone this repository.
Install the required packages listed in requirement.txt on Python 3.7+.
Download the MSMARCO collection file collection.tsv and store it in the collection directory.
If you want to predict the performance of the BM25 retrieval method on MSMARCO, skip this step. Otherwise, to evaluate any other retrieval method, prepare run files analogous to bm25_first_docs_train.tsv and bm25_first_docs_dev.tsv, which contain the first retrieved document for each query in the MSMARCO train and dev sets.
- The run file of your desired retrieval approach should have the following format, one query per line: QID<\t>DOCID<\t>1.
- Then, modify the run_file variable in create_train_pkl_file.py and create_test_pkl_file.py so that they point to your run files on the train and dev sets of MSMARCO.
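As an illustration of the run-file format described above, the following minimal sketch writes one `QID<\t>DOCID<\t>1` line per query. The function name `write_first_doc_run` and the example IDs are hypothetical and not part of this repository; the sketch assumes you already have the top-ranked document ID for each query from your own retriever.

```python
import csv
import io


def write_first_doc_run(first_docs, out_file):
    """Write one line per query in the form QID<TAB>DOCID<TAB>1.

    first_docs: dict mapping a query ID to the ID of its first retrieved document.
    out_file:   any writable text file object.
    """
    writer = csv.writer(out_file, delimiter="\t", lineterminator="\n")
    for qid, docid in first_docs.items():
        # The trailing 1 is the rank of the document (only the top hit is kept).
        writer.writerow([qid, docid, 1])


# Hypothetical top-ranked documents produced by some retrieval method.
first_docs = {"1048585": "7187158", "2": "4339068"}

# An in-memory buffer stands in for a real file such as my_first_docs_dev.tsv.
buffer = io.StringIO()
write_first_doc_run(first_docs, buffer)
run_text = buffer.getvalue()
print(run_text)
```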
To train BERT-QPP_cross, we require the query, the first retrieved document, and the queries' performance. To do so, in create_train_pkl_file.py we create a dictionary including the following attributes:

    train_dic[qid]["qtext"] = query_text
    train_dic[qid]["performance"] = query_performance_value
    train_dic[qid]["doc_text"] = document_text

You can train the model on your desired metric by creating the associated train pkl file. Here, we use map@20. Run create_train_pkl_file.py to save a dictionary containing the query and document text along with the associated performance. As a result, train_map.pkl will be saved in the pklfiles directory.
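For readers who want to compute the per-query target themselves, here is a minimal sketch of average precision at a cutoff of 20 for a single query. It is not taken from this repository: the function name `average_precision_at_k` is hypothetical, and normalising by `min(number of relevant documents, k)` is one common convention that may differ from the evaluation tool used to produce the values in train_map.pkl.

```python
def average_precision_at_k(ranked_docids, relevant_docids, k=20):
    """Average precision of one ranked list, truncated at rank k.

    ranked_docids:   document IDs in retrieval order (best first).
    relevant_docids: iterable of document IDs judged relevant for the query.
    """
    relevant = set(relevant_docids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, docid in enumerate(ranked_docids[:k], start=1):
        if docid in relevant:
            hits += 1
            # Precision at this rank, accumulated only at relevant positions.
            precision_sum += hits / rank
    # Assumption: normalise by the best achievable number of hits within k.
    return precision_sum / min(len(relevant), k)


# Toy ranking: relevant documents appear at ranks 2 and 4.
ranking = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
ap = average_precision_at_k(ranking, relevant, k=20)
print(ap)
```

Averaging this value over all queries gives the mean (the "m" in map@20); the per-query value is what a QPP model is trained to predict.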

Run create_test_pkl_file.py to save a dictionary containing the query and document text for the MSMARCO development set. As a result, test_dev_map.pkl will be saved in the pklfiles directory.
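The dictionary layout used by both pkl files can be sketched as follows. This is an assumption-laden illustration rather than the actual contents of create_train_pkl_file.py: the helper `build_qpp_dict`, the toy inputs, and the temporary output directory are hypothetical; only the keys `qtext`, `doc_text`, and `performance` come from the description above.

```python
import os
import pickle
import tempfile


def build_qpp_dict(queries, first_docs, collection, performance=None):
    """Combine query text, first retrieved document text, and (optionally)
    the query's performance into one dictionary keyed by query ID."""
    dic = {}
    for qid, qtext in queries.items():
        docid = first_docs[qid]
        entry = {"qtext": qtext, "doc_text": collection[docid]}
        if performance is not None:
            # Only the training dictionary carries the target value.
            entry["performance"] = performance[qid]
        dic[qid] = entry
    return dic


# Toy stand-ins for the query file, run file, collection.tsv, and metric values.
queries = {"q1": "what is query performance prediction"}
first_docs = {"q1": "d42"}
collection = {"d42": "Query performance prediction estimates retrieval quality."}
performance = {"q1": 0.5}

train_dic = build_qpp_dict(queries, first_docs, collection, performance)

# Save and reload the dictionary; a temporary directory stands in for pklfiles.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "train_map.pkl")
    with open(path, "wb") as f:
        pickle.dump(train_dic, f)
    with open(path, "rb") as f:
        loaded = pickle.load(f)
```

Calling `build_qpp_dict` without the `performance` argument yields the test-time layout, which holds only the query and document text.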

BERT-QPP_cross

Run train_CE.py to learn the map@20 of BM25 retrieval on the MSMARCO train set. Alternatively, you can train with your desired metric by creating the associated train pkl file. On a single 24GB RTX3090 GPU, training took less than 2 hours. You may also change the epoch_num, batch_size, and initial pre-trained model in this file. We used bert-base-uncased in this experiment. The trained model will be saved in the models directory.
If you do not want to train the model, you can download our BERT-QPP_cross model trained on bert-base-uncased from here.
Set the trained_model you want to test in test_CE.py and run test_CE.py.
The results will be saved in the results directory in the following format: QID<\t>Predicted_QPP_value
To evaluate the results, you can calculate the correlation between the actual performance of each query and the predicted QPP value.
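The evaluation step can be sketched in plain Python as below. The text above does not say which correlation coefficient to use, so this example assumes Pearson's linear correlation and Kendall's tau (the tau-a variant, which ignores tie corrections), both commonly reported for QPP. The function names and toy values are hypothetical; in practice a library routine such as those in SciPy would be the more usual choice.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(xs)
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Toy values: actual per-query performance and the predicted QPP scores.
actual = [0.1, 0.4, 0.35, 0.8]
predicted = [0.2, 0.3, 0.5, 0.9]

print("Pearson:", pearson(actual, predicted))
print("Kendall tau-a:", kendall_tau_a(actual, predicted))
```

A higher correlation means the predictor ranks or scales queries more consistently with their true retrieval performance.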

BERT-QPP_bi

Run train_bi.py to learn the map@20 of BM25 retrieval on the MSMARCO train set. On a single 24GB RTX3090 GPU, training took about 1 hour. You may also change the epoch_num, batch_size, and initial pre-trained model in this file. We used bert-base-uncased in this experiment. The trained model will be saved in the models directory.
If you do not want to train the model, you can download our BERT-QPP_bi model trained on bert-base-uncased from here.
Set the trained_model you want to test in test_bi.py and run test_bi.py.
The results will be saved in the results directory in the following format: QID<\t>Predicted_QPP_value
To evaluate the results, you can calculate the correlation between the actual performance of each query and the predicted QPP value.
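Before computing a correlation, the saved predictions have to be paired with the actual per-query performance. The following is a minimal sketch of that step, assuming the tab-separated result format described above; the helper names `read_predictions` and `align`, and the toy values, are hypothetical and not part of this repository.

```python
import io


def read_predictions(lines):
    """Parse lines of the form QID<TAB>Predicted_QPP_value into a dict."""
    predictions = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        qid, value = line.split("\t")
        predictions[qid] = float(value)
    return predictions


def align(predictions, actual):
    """Return query IDs present in both dicts, with values in matching order."""
    qids = sorted(set(predictions) & set(actual))
    predicted_values = [predictions[q] for q in qids]
    actual_values = [actual[q] for q in qids]
    return qids, predicted_values, actual_values


# An in-memory buffer stands in for a file in the results directory.
result_file = io.StringIO("q1\t0.31\nq2\t0.07\n")
predictions = read_predictions(result_file)

# Hypothetical actual performance values; q3 has no prediction and is dropped.
actual = {"q1": 0.5, "q2": 0.0, "q3": 1.0}
qids, predicted_values, actual_values = align(predictions, actual)
print(qids, predicted_values, actual_values)
```

The two aligned lists can then be passed to any correlation routine.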

Narabzad/BERTQPP
