How can I use the sentence pair model to get the similarity between two sentences?

Question

BruceLee66 opened this issue 5 years ago · 6 comments

I have 1,000,000 sentence pairs, and the two sentences in each pair have the same meaning. I trained the sentence pair model on this data and saved the model state as a pkl file. But when I use the trained model to evaluate new sentence pairs, almost all of them get a score of 1.0.
What should I do? Can you give me some advice?

Answer 1 · 2019-07-01T16:56:18.000Z

Are all of your training examples positive? No negatives?

Answer 2 · 2019-07-02T15:29:38.000Z

Yes, all of the sentence pairs are similar. When I use the trained model to score other sentence pairs whose sentences differ in meaning, the score is still very close to 1. I am really confused.

Answer 3 · 2019-07-02T16:43:48.000Z

You need negative samples for training; otherwise the model will be biased toward the positive case.

Answer 4 · 2019-07-03T06:39:12.000Z

I have decided to select negative examples randomly, with 5 times as many negative samples as positive ones. Would that be OK?

Answer 5 · 2019-07-03T16:27:50.000Z

1:1 should be enough. Importantly, you need to make sure the negative examples are meaningful: pairs that share many n-grams but are not paraphrases.
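
As a rough illustration of what "meaningful" negatives could look like, here is a minimal Python sketch that builds one negative per positive pair (a 1:1 ratio) by pairing each first sentence with the sentence from another pair that shares the most word n-grams with it. The function names (`ngrams`, `jaccard`, `build_hard_negatives`), the Jaccard overlap measure, the whitespace tokenization, and the `candidates_per_pair` sampling limit are all assumptions made for this example; they are not part of the sentence pair model or its repository. It also assumes that a sentence taken from a different pair is not a paraphrase, which may not hold if the data contains near-duplicate pairs.

```python
import random


def ngrams(sentence, n=2):
    """Return the set of word n-grams of a whitespace-tokenized sentence."""
    tokens = sentence.lower().split()
    if len(tokens) < n:
        # Sentence shorter than n words: treat the whole sentence as one n-gram.
        return {tuple(tokens)}
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_hard_negatives(positive_pairs, n=2, candidates_per_pair=50, seed=0):
    """Build one negative (a, other, 0) per positive pair (a, b).

    For each pair, a random subset of the other pairs is sampled and the
    second sentence with the highest n-gram overlap with `a` is kept.
    Label 0 means "not a paraphrase" (an assumption, see the note above).
    """
    rng = random.Random(seed)
    total = len(positive_pairs)
    negatives = []
    for i, (a, _) in enumerate(positive_pairs):
        a_grams = ngrams(a, n)
        # Sampling keeps the cost manageable on large datasets.
        sampled = rng.sample(range(total), min(candidates_per_pair, total))
        best, best_score = None, -1.0
        for j in sampled:
            if j == i:
                continue  # never pair a sentence with its own paraphrase
            candidate = positive_pairs[j][1]
            score = jaccard(a_grams, ngrams(candidate, n))
            if score > best_score:
                best, best_score = candidate, score
        if best is not None:
            negatives.append((a, best, 0))
    return negatives


if __name__ == "__main__":
    pairs = [
        ("how do i reset my password", "how can i reset my password"),
        ("how do i reset my router", "how can i reset my router"),
        ("what is the weather today", "tell me the weather today"),
    ]
    for negative in build_hard_negatives(pairs, candidates_per_pair=3):
        print(negative)
```

The positives would keep label 1 and be mixed with these negatives before training. Purely random negatives tend to be trivially different, so the model can learn to separate them without looking closely at meaning; negatives with high surface overlap are harder to separate.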

Answer 6 · 2019-07-05T03:36:44.000Z

Okay! I will try a ratio of 1:1. Thank you very much.