Incorrect pooling for BGE model
Serega6678 opened this issue · 1 comment
Serega6678 commented
BGE models (the default option) require CLS pooling rather than mean pooling:
https://huggingface.co/BAAI/bge-large-en#frequently-asked-questions
In the actual code, however, mean pooling is the default:
https://github.com/arcee-ai/DALM/blob/main/dalm/models/retriever_only_base_model.py#L60
Is this a bug or expected behaviour?
shamanez commented
Hi @Serega6678, we use mean pooling, following other sentence-transformer use cases. For example, we saw better results with mean pooling on the E5 family. Feel free to send us a PR that adds an option to select the pooling method.
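For anyone picking up that PR, here is a minimal sketch of what a selectable pooling step could look like. It is not the DALM implementation: the names `cls_pool`, `mean_pool`, `pool`, and the `mode` parameter are hypothetical, and plain Python lists stand in for tensors so the snippet runs without any dependencies. The default of `"mean"` only mirrors the behaviour described in this thread.

```python
from typing import List

# One row per token, one column per hidden dimension.
# Assumption: row 0 is the [CLS] token, as in BERT-style encoders.
Matrix = List[List[float]]


def cls_pool(token_embeddings: Matrix) -> List[float]:
    """Return the embedding of the first ([CLS]) token."""
    return list(token_embeddings[0])


def mean_pool(token_embeddings: Matrix, attention_mask: List[int]) -> List[float]:
    """Average the embeddings of the tokens whose mask value is 1."""
    kept = [row for row, m in zip(token_embeddings, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    # zip(*kept) iterates over hidden dimensions (columns).
    return [sum(column) / len(kept) for column in zip(*kept)]


def pool(token_embeddings: Matrix, attention_mask: List[int], mode: str = "mean") -> List[float]:
    """Hypothetical switch between pooling strategies."""
    if mode == "cls":
        return cls_pool(token_embeddings)
    if mode == "mean":
        return mean_pool(token_embeddings, attention_mask)
    raise ValueError(f"unknown pooling mode: {mode!r}")


# Three tokens with two hidden dimensions; the last token is padding.
embeddings = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

print(pool(embeddings, mask, mode="cls"))   # [1.0, 2.0]
print(pool(embeddings, mask, mode="mean"))  # [2.0, 3.0]
```

With an option like this, BGE checkpoints could be run with `mode="cls"` while E5-style models keep mean pooling.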