This repository contains tools and scripts for working with the TREC CAsT Track.
The repository is organised as follows:
corpus processing
: Scripts to process the CAsT 2022 document collection and generate canonical passage splits

mi_pool_generation
: Scripts to generate the mixed initiative question pool for CAsT 2022

run_validation
: Scripts to validate run files for CAsT 2022
This repository was recently reorganised, and much of the older code has been moved to the archive folder.