A question about Sec 3.2 in paper
Closed this issue · 3 comments
szxSpark commented
zhanghaiting001 commented
@szxSpark Have you figured out the meaning of the objective? Is it maximized or minimized? Thank you
mswellhao commented
To clarify: instead of using a single BERT model to encode s and fine-tuning only that one model throughout training, we use two separate BERT models (two copies of the model without weight sharing) to encode s as v_s and v_s', and we fine-tune both models during training.
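
For concreteness, a minimal sketch of that setup, assuming a HuggingFace-style `BertModel`; the model name, [CLS] pooling, and optimizer below are illustrative and not taken from the paper's code:

```python
# Illustrative sketch only -- not the paper's actual implementation.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Two separate BERT encoders: two copies of the weights, no weight sharing.
encoder_a = BertModel.from_pretrained("bert-base-uncased")
encoder_b = BertModel.from_pretrained("bert-base-uncased")

s = "an example sentence s"
inputs = tokenizer(s, return_tensors="pt")

# v_s and v_s' are produced by the two different encoders on the same input.
v_s = encoder_a(**inputs).last_hidden_state[:, 0]        # [CLS] vector from encoder A
v_s_prime = encoder_b(**inputs).last_hidden_state[:, 0]  # [CLS] vector from encoder B

# Both encoders' parameters go into the optimizer, so both are fine-tuned.
optimizer = torch.optim.Adam(
    list(encoder_a.parameters()) + list(encoder_b.parameters()), lr=2e-5
)
```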
WhyCube commented
@mswellhao I'd like to ask: are both BERT models fine-tuned at the same time? If so, which one do you use as the fine-tuned model afterwards?
Or is my understanding correct that there are two BERTs but only one of them has its parameters updated?