There may be an error in the models.py file
yurkoff-mv opened this issue · 6 comments
Hello.
In the models.py file, lines 75-83 take only the 'source' part of the input. I assume that in the first case (line 75) the 'z1_masked_input' part should be taken, and in the second case (line 80) the 'z2_masked_input' part should be taken. Is that so?
The key ingredient to get this to work with identical positive pairs is the use of independently sampled dropout masks for the two forward passes.
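As a minimal PyTorch sketch of that idea (illustrative only, not the repository's code): passing the same input through an encoder twice in train mode samples two independent dropout masks, so the two embeddings of one sentence differ and can serve as a positive pair.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for the sentence encoder; the real model is BERT/RoBERTa.
encoder = nn.Sequential(nn.Linear(256, 256), nn.Dropout(p=0.5))
encoder.train()  # dropout is active in train mode

x = torch.randn(1, 256)  # stand-in for one encoded sentence
z1 = encoder(x)          # first pass: dropout mask #1
z2 = encoder(x)          # second pass: dropout mask #2

# The two views of the same input differ, purely because the
# dropout masks were sampled independently.
print(torch.equal(z1, z2))  # False
```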
The generator receives z1_masked_input and z2_masked_input as input.
So the encoder is in the train state?
How are negative sentence samples obtained?
Also, have you tried freezing all layers of the BERT/RoBERTa model except the last one and training only that?
- So the encoder is in the train state?
-> I didn't understand your question.
- How are negative sentence samples obtained?
-> Negative samples are obtained by the in-batch negative sampling method.
- Also, have you tried freezing all layers of the BERT/RoBERTa model except the last one and training only that?
-> Didn't try.
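For readers unfamiliar with it, here is a hedged sketch of what in-batch negative sampling typically looks like (illustrative, not the repository's exact implementation): for a batch of N paired embeddings, each z1[i] treats z2[i] as its positive and the remaining N-1 entries of z2 as negatives, via cross-entropy over the cosine-similarity matrix.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(z1, z2, temperature=0.05):
    # Cosine similarities between every z1[i] and every z2[j].
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    sim = z1 @ z2.T / temperature  # shape (N, N)
    # Diagonal entries are the positives; the off-diagonal entries
    # of each row act as that example's in-batch negatives.
    labels = torch.arange(z1.size(0))
    return F.cross_entropy(sim, labels)

z1 = torch.randn(4, 16)  # 4 sentences, first view
z2 = torch.randn(4, 16)  # same 4 sentences, second view
loss = in_batch_contrastive_loss(z1, z2)
print(loss.item())
```

No extra negative corpus is needed: the other sentences already present in the batch supply the negatives.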
- A torch model can be in eval and train states. In the train state, Dropout is enabled.
- Now I understand.
Thanks for your replies.
In the /model/diffcse/processor.py file, lines 129 and 161 set the model to the training or evaluation state.
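For reference, a small PyTorch illustration of those two states (a toy model, not processor.py itself): model.train() enables dropout, while model.eval() disables it and makes the forward pass deterministic.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 64), nn.Dropout(p=0.1))
x = torch.randn(1, 64)

model.train()            # training state: dropout enabled
print(model.training)    # True

model.eval()             # evaluation state: dropout disabled
y1, y2 = model(x), model(x)
print(torch.equal(y1, y2))  # True: eval outputs are deterministic
```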
Yes, now I see. Thank you.