Is it unfair to use a pre-trained RoBERTa model?
Norwa9 commented
Most baseline methods in Table 1 use an LSTM without pre-trained parameters as the backbone. I think it would be better to also report performance with an LSTM backbone, to avoid the extra benefit that the pre-trained RoBERTa model brings.
Monoxide-Chen commented
Thanks a lot for your attention to our paper, and for your understanding.
- We want to apply a strong baseline model to show the improvement of our method.
- Yes; in fact, we could not reproduce the baseline result provided by the CosMo code. Therefore, we use a pretrained RoBERTa to obtain a competitive baseline (see the sketch after this list for what that swap typically looks like).
- We do not want to overclaim a SOTA result, but rather to focus on our relative improvement over such a strong baseline.
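For readers unfamiliar with the swap discussed above: replacing a randomly initialized LSTM text encoder with a pretrained RoBERTa typically looks like the sketch below, using the Hugging Face `transformers` library. This is an illustrative assumption, not this repository's actual code; the checkpoint (`roberta-base`), the masked mean pooling, and the fine-tuning setup are all hypothetical choices.

```python
import torch
from transformers import RobertaModel, RobertaTokenizer

# Assumption: "roberta-base" with masked mean pooling over token states.
# The paper's actual checkpoint, pooling, and fine-tuning may differ.
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")

def encode_text(texts):
    """Return one embedding per input string (mean over non-padding tokens)."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state          # (B, T, 768)
    mask = inputs["attention_mask"].unsqueeze(-1).float()     # (B, T, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)       # (B, 768)

embeddings = encode_text(["a red dress", "the same dress but in blue"])
print(embeddings.shape)  # torch.Size([2, 768])
```

The pre-trained weights are what give this encoder its head start over a from-scratch LSTM, which is exactly the concern raised in the question.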