my_dr

A dense retrieval pipeline built on sentence-transformers.

notes:

This project uses accelerate to run DDP (DistributedDataParallel) during both training and inference, so install accelerate first and fill in accelerate_config.yaml.
Check the training and inference scripts carefully to see how the paths of the required files (e.g. checkpoints) are organized, and change them to suit your own setup.
Start the training script, then run inference_during_train.py in a separate process so that training and inference proceed concurrently.
This project targets MS MARCO. If you want to process BEIR datasets, make sure to convert them to the matching format first.
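
As a rough illustration of the BEIR note above, the sketch below converts the publicly documented BEIR layout (corpus.jsonl and queries.jsonl with "_id", "title" and "text" fields, plus a qrels TSV with a "query-id / corpus-id / score" header) into MS MARCO-style TSV rows ("pid<TAB>passage", "qid<TAB>query", "qid<TAB>0<TAB>pid<TAB>relevance"). The function names are hypothetical and not part of this project, and the exact layout the scripts here expect is an assumption, so check the data-loading code before relying on it.

```python
import json


def _clean(text):
    # Tabs and newlines would break the TSV layout, so collapse all whitespace.
    return " ".join(text.split())


def beir_corpus_to_tsv(jsonl_lines):
    """Yield 'pid<TAB>passage' rows from BEIR corpus.jsonl lines (hypothetical helper)."""
    for line in jsonl_lines:
        if not line.strip():
            continue
        doc = json.loads(line)
        title = doc.get("title", "")
        text = doc.get("text", "")
        # Assumption: the title is prepended to the passage text when present.
        passage = _clean(title + " " + text)
        yield doc["_id"] + "\t" + passage


def beir_queries_to_tsv(jsonl_lines):
    """Yield 'qid<TAB>query' rows from BEIR queries.jsonl lines (hypothetical helper)."""
    for line in jsonl_lines:
        if not line.strip():
            continue
        query = json.loads(line)
        yield query["_id"] + "\t" + _clean(query["text"])


def beir_qrels_to_msmarco(tsv_lines):
    """Yield 'qid<TAB>0<TAB>pid<TAB>rel' rows from BEIR qrels TSV lines (hypothetical helper)."""
    for line in tsv_lines:
        parts = line.rstrip("\n").split("\t")
        # Skip the BEIR header row and any malformed line.
        if len(parts) != 3 or parts[0] == "query-id":
            continue
        qid, pid, score = parts
        yield "\t".join([qid, "0", pid, score])
```

Each helper takes any iterable of lines, so an open file handle can be passed directly, for example: for row in beir_corpus_to_tsv(open("corpus.jsonl", encoding="utf-8")). Note that BEIR ids are strings while MS MARCO ids are integers; if the scripts cast ids to int, an extra id mapping would be needed.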