Primary language: Python. License: Apache-2.0.

LiT5 (List-in-T5) Reranking

RankLLM

We have integrated LiT5 into RankLLM, which is actively maintained and includes additional improvements. We highly recommend using RankLLM.

📟 Instructions

We provide the scripts and data necessary to reproduce the reranking results for LiT5-Distill and LiT5-Score on DL19 and DL20 with BM25 and SPLADE++ ED first-stage retrieval. You may need to change the batch size depending on your available VRAM. We have observed that results may change slightly when the batch size is changed; this is a known issue when running inference in bfloat16. If your GPU does not support bfloat16, you may also need to remove the --bfloat16 option from the scripts.
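The batch-size sensitivity noted above is consistent with how low-precision arithmetic generally behaves: floating-point addition is not associative, and bfloat16 keeps only 8 significant bits, so summing the same numbers in a different order can round differently. The following is a minimal, standard-library-only sketch of that effect. The helpers `to_bfloat16` and `bf16_sum` are hypothetical names written for this illustration; they emulate bfloat16 rounding in software, are not part of the LiT5 scripts, and do not claim to pinpoint the exact source of the variation in LiT5.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16-representable value (ties to even)."""
    # Reinterpret the value as the 32 bits of an IEEE 754 float32.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    # bfloat16 keeps the upper 16 bits; round the lower 16 bits away,
    # breaking ties toward an even retained mantissa.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def bf16_sum(values):
    """Accumulate values left to right, rounding to bfloat16 after each add."""
    total = 0.0
    for v in values:
        total = to_bfloat16(total + to_bfloat16(v))
    return total


# The same three numbers, accumulated in two different orders.
large_first = bf16_sum([256.0, 1.0, 1.0])
small_first = bf16_sum([1.0, 1.0, 256.0])
print(large_first, small_first)
```

In this emulation the two orders give 256.0 and 258.0 respectively, because near 256 adjacent bfloat16 values are 2 apart and each single `+ 1.0` is rounded away. Changing the batch size may change the order in which a GPU accumulates values, which is one plausible way small differences can appear.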

Note that the v2 LiT5-Distill models support reranking up to 100 passages at once.
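To illustrate what reranking 100 passages at once can save, the sketch below counts the model calls needed when a candidate list is instead processed in overlapping windows, a strategy commonly used by listwise rerankers. The function name `sliding_windows` and the window size of 20 with a stride of 10 are assumptions chosen for illustration; this README does not state the window settings used by the scripts.

```python
def sliding_windows(num_passages, window_size, stride):
    """Return (start, end) index pairs covering the list from back to front.

    Each pair is a half-open slice [start, end) that would be sent to the
    reranker in one call. Consecutive windows overlap by
    (window_size - stride) passages when window_size > stride.
    """
    if num_passages <= 0:
        return []
    windows = []
    end = num_passages
    start = max(0, end - window_size)
    while True:
        windows.append((start, end))
        if start == 0:
            break
        end -= stride
        start = max(0, end - window_size)
    return windows


# Hypothetical settings: 100 candidates, windows of 20, stride of 10.
windowed = sliding_windows(100, window_size=20, stride=10)
# A model that accepts all 100 passages needs a single window.
single_pass = sliding_windows(100, window_size=100, stride=100)
print(len(windowed), len(single_pass))
```

Under these assumed settings, the windowed approach needs 9 calls per query, while a single 100-passage pass needs 1.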

Models

The following table lists our models hosted on Hugging Face:

| Model Name | Hugging Face Identifier/Link |
|---|---|
| LiT5-Distill-base | castorini/LiT5-Distill-base |
| LiT5-Distill-large | castorini/LiT5-Distill-large |
| LiT5-Distill-xl | castorini/LiT5-Distill-xl |
| LiT5-Distill-base-v2 | castorini/LiT5-Distill-base-v2 |
| LiT5-Distill-large-v2 | castorini/LiT5-Distill-large-v2 |
| LiT5-Distill-xl-v2 | castorini/LiT5-Distill-xl-v2 |
| LiT5-Score-base | castorini/LiT5-Score-base |
| LiT5-Score-large | castorini/LiT5-Score-large |
| LiT5-Score-xl | castorini/LiT5-Score-xl |
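The identifiers above follow a regular pattern, which can be handy when looping over checkpoints in a script. The sketch below builds an identifier from its parts; `hub_id` is a hypothetical helper written for this example and is not part of the repository. The commented-out download call assumes the `huggingface_hub` package is installed and network access is available.

```python
FAMILIES = ("Distill", "Score")
SIZES = ("base", "large", "xl")


def hub_id(family, size, v2=False):
    """Build the Hugging Face identifier for a checkpoint listed in the table."""
    if family not in FAMILIES:
        raise ValueError(f"unknown family: {family!r}")
    if size not in SIZES:
        raise ValueError(f"unknown size: {size!r}")
    if v2 and family != "Distill":
        # The table lists v2 checkpoints only for LiT5-Distill.
        raise ValueError("v2 checkpoints are listed only for LiT5-Distill")
    suffix = "-v2" if v2 else ""
    return f"castorini/LiT5-{family}-{size}{suffix}"


# Enumerate the nine identifiers listed in the table.
all_ids = [hub_id("Distill", s) for s in SIZES]
all_ids += [hub_id("Distill", s, v2=True) for s in SIZES]
all_ids += [hub_id("Score", s) for s in SIZES]
print(all_ids)

# Optional, not run here: fetch the checkpoint files to the local cache.
# from huggingface_hub import snapshot_download
# local_dir = snapshot_download(repo_id=hub_id("Distill", "base"))
```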

Expected Results

The tables below show the expected nDCG@10 results for reranking with BM25 first-stage retrieval.

DL19

| Model Name | nDCG@10 |
|---|---|
| LiT5-Distill-base | 71.7 |
| LiT5-Distill-large | 72.7 |
| LiT5-Distill-xl | 72.3 |
| LiT5-Distill-base-v2 | 71.7 |
| LiT5-Distill-large-v2 | 73.3 |
| LiT5-Distill-xl-v2 | 73.0 |
| LiT5-Score-base | 68.9 |
| LiT5-Score-large | 72.0 |
| LiT5-Score-xl | 70.0 |

DL20

| Model Name | nDCG@10 |
|---|---|
| LiT5-Distill-base | 68.0 |
| LiT5-Distill-large | 70.0 |
| LiT5-Distill-xl | 71.8 |
| LiT5-Distill-base-v2 | 66.7 |
| LiT5-Distill-large-v2 | 69.8 |
| LiT5-Distill-xl-v2 | 73.7 |
| LiT5-Score-base | 66.2 |
| LiT5-Score-large | 67.8 |
| LiT5-Score-xl | 65.7 |

The tables below show the expected nDCG@10 results for reranking with SPLADE++ ED first-stage retrieval.

DL19

| Model Name | nDCG@10 |
|---|---|
| LiT5-Distill-base | 74.6 |
| LiT5-Distill-large | 76.8 |
| LiT5-Distill-xl | 76.8 |
| LiT5-Distill-base-v2 | 78.3 |
| LiT5-Distill-large-v2 | 80.0 |
| LiT5-Distill-xl-v2 | 78.5 |
| LiT5-Score-base | 68.4 |
| LiT5-Score-large | 68.7 |
| LiT5-Score-xl | 69.0 |

DL20

| Model Name | nDCG@10 |
|---|---|
| LiT5-Distill-base | 74.1 |
| LiT5-Distill-large | 76.5 |
| LiT5-Distill-xl | 76.7 |
| LiT5-Distill-base-v2 | 75.1 |
| LiT5-Distill-large-v2 | 76.6 |
| LiT5-Distill-xl-v2 | 80.4 |
| LiT5-Score-base | 68.5 |
| LiT5-Score-large | 73.1 |
| LiT5-Score-xl | 71.0 |
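For readers unfamiliar with the metric, the following is a minimal sketch of nDCG@10 for a single query, assuming graded relevance labels used directly as gains and a log2 rank discount. The function names are hypothetical, and this is not the evaluation tool used to produce the numbers above, which may differ in details. The tables report the metric multiplied by 100.

```python
import math


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the first k positions (rank starts at 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(ranked_gains, judged_gains, k=10):
    """nDCG@k for one query.

    ranked_gains: relevance grades of the returned passages, in ranked order.
    judged_gains: relevance grades of all judged passages for the query,
                  used to build the ideal ranking.
    """
    ideal = dcg_at_k(sorted(judged_gains, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(ranked_gains, k) / ideal


# Toy example: a perfect ranking and one where the only relevant
# passage is placed second instead of first.
perfect = ndcg_at_k([3, 2, 1, 0], [3, 2, 1, 0])
swapped = ndcg_at_k([0, 3], [3, 0])
print(perfect, swapped)
```

A system-level score such as those in the tables is typically the mean of this per-query value over all queries in the test collection.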

✨ References

If you use LiT5, please cite the following paper: [2312.16098] Scaling Down, LiTting Up: Efficient Zero-Shot Listwise Reranking with Seq2seq Encoder-Decoder Models

@ARTICLE{tamber2023scaling,
  title   = {Scaling Down, LiTting Up: Efficient Zero-Shot Listwise Reranking with Seq2seq Encoder-Decoder Models},
  author  = {Manveer Singh Tamber and Ronak Pradeep and Jimmy Lin},
  year    = {2023},
  journal = {arXiv preprint arXiv:2312.16098}
}

🙏 Acknowledgments

This repository borrows code from the original FiD repository, the atlas repository, and the RankLLM repository!