Question about the metrics reported in the paper
dsj96 opened this issue · 0 comments
dsj96 commented
Hello! I am a new NLPer, and I am confused about the pipeline (pretrain -> finetune -> test) used for pre-training large language models.
- I would like to know at which stage the unlabeled dataset (e.g., C4) and the labeled datasets (e.g., GLUE, SuperGLUE, WMT) are used, respectively.
In Section 2.4 of the paper, I find that:
> We instead allow for separately fine-tuning the model on each individual task and use short task prefixes instead of an explicit question-answer format.
As shown in Table 1 of the paper, was the T5 model pre-trained on the C4 dataset, then fine-tuned separately on the GLUE, CNNDM, SQuAD, SGLUE, and WMT datasets, with the resulting scores reported in Table 1? (My current understanding is sketched in the code below.)
- For other large language models, like GPT and GPT-2, were those models also fine-tuned on labeled datasets before their scores were reported?
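To make my understanding of the pipeline concrete, here is a minimal sketch of what I think happens, assuming the Hugging Face `transformers` library and the public `t5-small` checkpoint (both are my own choices for illustration, not something taken from the paper or this repo):

```python
# A rough sketch of my understanding of the T5 pipeline, NOT the authors' code:
#   1) start from a checkpoint already pre-trained on unlabeled C4,
#   2) fine-tune it separately on one labeled task (here GLUE/CoLA) using a task prefix,
#   3) evaluate on that task and report its score.
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

# Step 1: load a checkpoint pre-trained on the unlabeled C4 corpus.
tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Step 2: cast one labeled GLUE example into text-to-text form with a short task prefix
# ("cola sentence:" is the CoLA prefix; the label becomes the text "acceptable"/"unacceptable").
inputs = tokenizer("cola sentence: The book was written by John.", return_tensors="pt")
labels = tokenizer("acceptable", return_tensors="pt").input_ids

# One fine-tuning step on this single labeled example
# (a real run would loop over the whole task's training split).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# Step 3: after fine-tuning on this task alone, generate predictions on the task's
# validation/test split and compute its metric (e.g., Matthews correlation for CoLA).
model.eval()
with torch.no_grad():
    prediction = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(prediction[0], skip_special_tokens=True))
```

If this sketch is roughly right, then each task score in Table 1 comes from a separate fine-tuned copy of the same pre-trained model. Is that correct?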
Thank you!