Monoxide-Chen/uncertainty_retrieval

Is it unfair to use a pre-trained RoBERTa model?

Closed this issue · 1 comments

Most baseline methods in Table 1 use an LSTM without pre-trained parameters as the backbone.
I think it would be better to also report performance with an LSTM backbone, to rule out the extra benefit that the pre-trained RoBERTa model brings.

Thanks a lot for your attention to our paper and for your understanding.

  1. We wanted to build on a strong baseline model to demonstrate the improvement our method brings.

  2. Yes. In fact, we could not reproduce the baseline result provided by the CosMo code. Therefore, we used the pre-trained RoBERTa to obtain a competitive baseline.

  3. We do not intend to overclaim a state-of-the-art result; rather, we focus on the relative improvement our method achieves over such a strong baseline.