Notebooks to train and evaluate sentence-transformers models on Spanish datasets
The trained models are fine-tuned versions of PlanTL-GOB-ES/roberta-base-bne, specialized for question/answer retrieval using two Spanish translations of the MS-MARCO dataset.
We have trained several versions, using different configurations:
- Model 1
  - Link
  - Config
  - Dataset: dariolopez/ms-marco-es (query - positive - negative)
  - Loss: TripletLoss
- Model 2
  - Link
  - Config
  - Dataset: IIC/ms_marco_es (query - positive - negative - negative - negative - negative)
  - Loss: MultipleNegativesRankingLoss
- Model 3
  - Link (in progress)
  - Config (in progress)
  - Dataset: dariolopez/ms-marco-es (query - positive - negative)
  - Loss: MultipleNegativesRankingLoss
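
The configurations above differ mainly in the loss and in how many negatives each row carries. To give an intuition for that difference, here is a minimal pure-Python sketch of the two objectives on toy 2-D vectors. It is not the sentence-transformers implementation and not the code from the notebooks: the function names and the toy embeddings are hypothetical, and the defaults (`margin=5.0` with Euclidean distance, `scale=20.0` with cosine similarity) are assumed to mirror the library's defaults.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=5.0):
    """Hinge on distances: the positive should be closer to the anchor
    than the negative by at least `margin` (assumed default: 5.0)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


def multiple_negatives_ranking_loss(queries, positives, hard_negatives=(), scale=20.0):
    """Cross-entropy over scaled cosine similarities.

    For query i the correct candidate is positives[i]; every other
    positive in the batch, plus any hard negatives, acts as a negative.
    """
    candidates = list(positives) + list(hard_negatives)
    total = 0.0
    for i, query in enumerate(queries):
        logits = [scale * cosine(query, c) for c in candidates]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)


# Toy batch (hypothetical embeddings, not model outputs).
queries = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[0.0, 1.0], [1.0, 0.0]]

# One (query, positive, negative) row, as in the triplet-style dataset.
print("triplet:", triplet_loss(queries[0], positives[0], negatives[0]))

# In-batch negatives only, then with extra hard negatives per row.
print("mnrl (in-batch):", multiple_negatives_ranking_loss(queries, positives))
print("mnrl (+ hard negatives):", multiple_negatives_ranking_loss(queries, positives, negatives))
```

The triplet objective only looks at one negative per query, while the ranking objective compares each query against every candidate in the batch, so larger batches and extra negative columns give it more signal per step.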