A dense retrieval pipeline built on sentence-transformers.
- This project uses accelerate to run DDP (distributed data parallel) for both training and inference, so install accelerate first and fill in accelerate_config.yaml.
- Check the training and inference scripts carefully to see how the paths to the required files (e.g. checkpoints) are organized, and change them to match your own setup.
- Run the training script, then run inference_during_train.py in a separate process so that training and inference proceed concurrently.
- This project targets MS MARCO. If you want to process BEIR datasets, make sure to convert them to the expected data format first.
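
For the BEIR case, the conversion could look roughly like the sketch below. It assumes the standard BEIR layout (`corpus.jsonl`, `queries.jsonl`, `qrels/<split>.tsv` with a header row) and writes MS MARCO-style TSV files. The function name `beir_to_msmarco` and the output file names are hypothetical and not part of this project, so adjust them to whatever the training and inference scripts actually read. Note that BEIR ids are strings, while MS MARCO ids are integers; the sketch keeps the ids unchanged.

```python
import json
from pathlib import Path


def _clean(text):
    # Collapse tabs and newlines so each record stays on a single TSV line.
    return " ".join(text.split())


def beir_to_msmarco(beir_dir, out_dir, split="test"):
    """Convert a BEIR dataset folder into MS MARCO-style TSV files.

    Hypothetical helper: the output names below are assumptions.
    """
    beir_dir, out_dir = Path(beir_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # corpus.jsonl -> collection.tsv with rows "pid<TAB>passage".
    # The title, when present, is prepended to the passage text.
    with open(beir_dir / "corpus.jsonl", encoding="utf-8") as src, \
            open(out_dir / "collection.tsv", "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue
            doc = json.loads(line)
            passage = _clean(f"{doc.get('title', '')} {doc['text']}")
            dst.write(f"{doc['_id']}\t{passage}\n")

    # queries.jsonl -> queries.tsv with rows "qid<TAB>query".
    with open(beir_dir / "queries.jsonl", encoding="utf-8") as src, \
            open(out_dir / "queries.tsv", "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue
            query = json.loads(line)
            dst.write(f"{query['_id']}\t{_clean(query['text'])}\n")

    # qrels/<split>.tsv (query-id, corpus-id, score, with a header row)
    # -> qrels.<split>.tsv with rows "qid<TAB>0<TAB>pid<TAB>relevance".
    with open(beir_dir / "qrels" / f"{split}.tsv", encoding="utf-8") as src, \
            open(out_dir / f"qrels.{split}.tsv", "w", encoding="utf-8") as dst:
        next(src, None)  # skip the BEIR header row
        for line in src:
            parts = line.rstrip("\n").split("\t")
            if len(parts) < 3:
                continue
            qid, pid, score = parts[:3]
            dst.write(f"{qid}\t0\t{pid}\t{score}\n")
```

A call such as `beir_to_msmarco("datasets/scifact", "data/scifact_msmarco")` (both paths are placeholders) would then produce `collection.tsv`, `queries.tsv`, and `qrels.test.tsv` in the output folder. If the scripts in this project expect integer ids, an extra id-mapping step would be needed.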