Add older MMTEB baselines
orionw opened this issue · 1 comment
I was aggregating models for a different project and realized there are a couple of older multilingual baselines we should try: facebook/mcontriever-msmarco
(multilingual Contriever) and castorini/mdpr-tied-pft-msmarco
(DPR-based, with tied encoders). I don't think they're going to be very strong, but they're worth having since they're baselines people may have heard of. Both can be run with sentence-transformers out of the box.
cc @KennethEnevoldsen and @Muennighoff
Great, I ran nthakur/mcontriever-base-msmarco
(the facebook one did not work with sentence-transformers for me) and castorini/mdpr-tied-pft-msmarco.
Results are here: embeddings-benchmark/results#40
Are we good on bge-m3 & gte-multilingual-base, or are results still missing? I think we should have those as well.
https://huggingface.co/BAAI/bge-multilingual-gemma2 also seems worth running, but I don't think it's in MTEB yet 🤔