Incorrect pooling BGE model

Question

Serega6678 opened this issue 7 months ago · 1 comment

BGE models (the default option) require CLS pooling, not mean pooling:
https://huggingface.co/BAAI/bge-large-en#frequently-asked-questions

In the actual code, however, mean pooling is the default:
https://github.com/arcee-ai/DALM/blob/main/dalm/models/retriever_only_base_model.py#L60

Is this a bug or expected behaviour?

Answer 1 · 2024-03-06T20:58:58.000Z

Hi @Serega6678, we use mean pooling, following other sentence-transformer use cases. For example, we saw better results with mean pooling for the E5 family. Feel free to send us a PR that adds an option to select the pooling method.
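
For illustration, a selectable pooling step could look roughly like the sketch below. It is not DALM's actual code: the function name `pool`, the `mode` parameter, and the use of NumPy arrays in place of model tensors are all assumptions made for this example. It assumes the token embeddings have shape (batch, seq_len, hidden) and that the first token of each sequence is `[CLS]`, as it is for BGE.

```python
import numpy as np


def pool(last_hidden_state, attention_mask, mode="mean"):
    """Reduce token embeddings (batch, seq_len, hidden) to (batch, hidden).

    `pool` and `mode` are hypothetical names used only for this sketch.
    """
    if mode == "cls":
        # CLS pooling: keep only the hidden state of the first token.
        return last_hidden_state[:, 0]
    if mode == "mean":
        # Mean pooling: average over real tokens, ignoring padding positions.
        mask = attention_mask[..., None].astype(last_hidden_state.dtype)
        summed = (last_hidden_state * mask).sum(axis=1)
        counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
        return summed / counts
    raise ValueError(f"Unknown pooling mode: {mode!r}")


# One sequence of three tokens; the last token is padding.
hidden = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]])
mask = np.array([[1, 1, 0]])

cls_embedding = pool(hidden, mask, mode="cls")
mean_embedding = pool(hidden, mask, mode="mean")
```

With a switch like this, BGE checkpoints could be run with `mode="cls"` while E5-style models keep `mode="mean"`.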