can this model find cosine similarity between two paragraphs
desis123 opened this issue · 1 comment
desis123 commented
I was just wondering whether the model at https://huggingface.co/dennlinger/roberta-cls-consec can compute cosine or dot-product similarity between two paragraphs of text, the way Sentence-BERT computes cosine similarity between two sentences?
dennlinger commented
Hi @desis123,
By default, I would say it cannot. Our models were trained in a combined input setting, i.e., two paragraphs are fed into the same forward pass, separated by a [SEP] token.
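To make the combined input concrete, here is a toy sketch of how two paragraphs end up in a single sequence. The whitespace splitting and the `build_joint_input` helper are made up for illustration; the real tokenizer works on subwords and may spell its separator differently.

```python
SEP = "[SEP]"  # separator as named above; the real tokenizer's spelling may differ


def build_joint_input(par_a, par_b):
    """Toy version of the combined input: one sequence holding both paragraphs."""
    # Hypothetical whitespace tokenization, only to keep the sketch self-contained.
    return par_a.split() + [SEP] + par_b.split()


tokens = build_joint_input("First paragraph .", "Second paragraph .")
print(tokens)
```

Because the model sees this as one sequence, a forward pass gives one output for the pair as a whole, not a separate vector per paragraph.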
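In comparison, late interaction models (or, more generally, dual encoders) process one paragraph at a time rather than two, so each paragraph gets its own embedding, which can then be compared via cosine or dot-product similarity. Our model never encodes a paragraph on its own during training. Therefore, I would argue that it is not particularly suited to producing meaningful embeddings.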
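For contrast, this is roughly what the comparison step looks like once a dual encoder has produced one vector per paragraph. It is only a sketch: `emb_a` and `emb_b` are placeholder values, not the output of any real model, and `cosine_similarity` is a plain-Python helper written for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Placeholder vectors standing in for two independently encoded paragraphs.
emb_a = [0.2, 0.7, 0.1]
emb_b = [0.1, 0.8, 0.3]
print(round(cosine_similarity(emb_a, emb_b), 3))
```

The comparison only makes sense when each vector was computed from a single paragraph, which is exactly the setting a jointly trained model is not optimized for.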
Best,
Dennis