There may be an error in the models.py file
yurkoff-mv opened this issue · 6 comments
Hello.
In the models.py file, lines 75-83 take only the 'source' part of the input. I assume that in the first case (line 75) the 'z1_masked_input' part should be taken, and in the second case (line 80) the 'z2_masked_input' part should be taken. Is that so?
The key ingredient to get this to work with identical positive pairs is the use of independently sampled dropout masks for the two forward passes.
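As a minimal PyTorch sketch of that idea (illustrative only, not the repository's code): passing the same input through an encoder twice in train mode samples two independent dropout masks, so the two embeddings of one sentence differ and can serve as a positive pair.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for the sentence encoder; the real model is BERT/RoBERTa.
encoder = nn.Sequential(nn.Linear(256, 256), nn.Dropout(p=0.5))
encoder.train()  # dropout is active in train mode

x = torch.randn(1, 256)  # stand-in for one encoded sentence
z1 = encoder(x)          # first pass: dropout mask #1
z2 = encoder(x)          # second pass: dropout mask #2

# The two views of the same input differ, purely because the
# dropout masks were sampled independently.
print(torch.equal(z1, z2))  # False
```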
The generator receives z1_masked_input and z2_masked_input as input.
So the encoder is in the train state?
How are negative sentence samples obtained?
Also, have you tried freezing all layers of the BERT/RoBERTa model except the last one and training only that?
- So the encoder is in the train state?
-> I didn't understand your question.
- How are negative sentence samples obtained?
-> Negative samples are obtained by the in-batch negative sampling method.
- Also, have you tried freezing all layers of the BERT/RoBERTa model except the last one and training only that?
-> Didn't try.
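For readers unfamiliar with it, here is a hedged sketch of what in-batch negative sampling typically looks like (illustrative, not the repository's exact implementation): for a batch of N paired embeddings, each z1[i] treats z2[i] as its positive and the remaining N-1 entries of z2 as negatives, via cross-entropy over the cosine-similarity matrix.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(z1, z2, temperature=0.05):
    # Cosine similarities between every z1[i] and every z2[j].
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    sim = z1 @ z2.T / temperature  # shape (N, N)
    # Diagonal entries are the positives; the off-diagonal entries
    # of each row act as that example's in-batch negatives.
    labels = torch.arange(z1.size(0))
    return F.cross_entropy(sim, labels)

z1 = torch.randn(4, 16)  # 4 sentences, first view
z2 = torch.randn(4, 16)  # same 4 sentences, second view
loss = in_batch_contrastive_loss(z1, z2)
print(loss.item())
```

No extra negative corpus is needed: the other sentences already present in the batch supply the negatives.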
- A torch model can be in eval and train states. In the train state, Dropout is enabled.
- Now I understand.
Thanks for your replies.
In the /model/diffcse/processor.py file, lines 129 and 161 set the model to the training or evaluation state.
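For reference, a small PyTorch illustration of those two states (a toy model, not processor.py itself): model.train() enables dropout, while model.eval() disables it and makes the forward pass deterministic.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 64), nn.Dropout(p=0.1))
x = torch.randn(1, 64)

model.train()            # training state: dropout enabled
print(model.training)    # True

model.eval()             # evaluation state: dropout disabled
y1, y2 = model(x), model(x)
print(torch.equal(y1, y2))  # True: eval outputs are deterministic
```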
Yes, now I see. Thank you.