linzehui/mRASP

Vocabulary and embedding-matrix mismatch between the fine-tuning stage and the pre-trained model

Closed this issue · 1 comment

Hi, I followed the finetune steps in the README to fine-tune on my own corpus and got the following error:
RuntimeError: Error(s) in loading state_dict for TransformerModel:
size mismatch for encoder.embed_tokens.weight: copying a param with shape torch.Size([64871, 1024]) from checkpoint, the shape in current model is torch.Size([4130, 1024]).
size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([64871, 1024]) from checkpoint, the shape in current model is torch.Size([4130, 1024]).

It looks like the vocabulary generated during fine-tuning is inconsistent with the one used in pre-training, but the README does not describe how to align the fine-tuning corpus's vocabulary with the pre-trained model's (64871 entries). Am I missing a step?

I see the vocabulary has now been provided, thx.