A question about Sec 3.2 in paper
Closed this issue · 3 comments
szxSpark commented
zhanghaiting001 commented
@szxSpark Have you figured out the meaning of the objective? Is it maximized or minimized? Thank you
mswellhao commented
To clarify: instead of using a single BERT model to encode s and fine-tuning only that one model throughout training, we use two separate BERT models (two copies of the model without weight sharing) to encode s as v_s and v_s', and we fine-tune both models during training.
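
For concreteness, a minimal sketch of that setup, assuming a HuggingFace-style `BertModel`; the model name, [CLS] pooling, and optimizer below are illustrative and not taken from the paper's code:

```python
# Illustrative sketch only -- not the paper's actual implementation.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Two separate BERT encoders: two copies of the weights, no weight sharing.
encoder_a = BertModel.from_pretrained("bert-base-uncased")
encoder_b = BertModel.from_pretrained("bert-base-uncased")

s = "an example sentence s"
inputs = tokenizer(s, return_tensors="pt")

# v_s and v_s' are produced by the two different encoders on the same input.
v_s = encoder_a(**inputs).last_hidden_state[:, 0]        # [CLS] vector from encoder A
v_s_prime = encoder_b(**inputs).last_hidden_state[:, 0]  # [CLS] vector from encoder B

# Both encoders' parameters go into the optimizer, so both are fine-tuned.
optimizer = torch.optim.Adam(
    list(encoder_a.parameters()) + list(encoder_b.parameters()), lr=2e-5
)
```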
WhyCube commented
@mswellhao I'd like to ask: are both BERT models fine-tuned at the same time? If so, which one do you use as the fine-tuned model afterwards?
Or is my understanding correct that there are two BERTs but only one of them has its parameters updated?