dennlinger/TopicalChange

can this model find cosine similarity between two paragraphs

desis123 opened this issue · 1 comments

I was just wondering can this https://huggingface.co/dennlinger/roberta-cls-consec model perform to find cosine / dot similarities between two paragraph of text . Like sentenceBert can perform cosine similarities between two sentences?

Hi @desis123,
By default, I would say it cannot. Our models were trained with a combined input setting (i.e., two paragraphs fed into the same forward pass, separated by a [SEP] token.
In comparison, late interaction models (or more generally, dual encoders) are not processing two, but one paragraph at a time. Therefore, I would argue that our model is not particularly suited towards producing meaningful embeddings.

Best,
Dennis