nlpyang/PreSumm

Does the additional position embedding increase the parameters of the Transformer?

ken-ando opened this issue · 0 comments

This work introduces additional position embeddings for inputs longer than 512 tokens:

```python
if (args.max_pos > 512):
    # Build a larger position-embedding table: copy the 512 pretrained rows,
    # then fill the remaining rows with copies of the last pretrained row.
    my_pos_embeddings = nn.Embedding(args.max_pos, self.bert.model.config.hidden_size)
    my_pos_embeddings.weight.data[:512] = self.bert.model.embeddings.position_embeddings.weight.data
    my_pos_embeddings.weight.data[512:] = self.bert.model.embeddings.position_embeddings.weight.data[-1][None, :].repeat(args.max_pos - 512, 1)
    self.bert.model.embeddings.position_embeddings = my_pos_embeddings
```
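For reference, here is a minimal, self-contained sketch (not from the repo; it assumes the Hugging Face `transformers` `BertModel` API rather than the `self.bert.model` wrapper used in PreSumm) that applies the same extension and prints the resulting embedding shape:

```python
import torch
import torch.nn as nn
from transformers import BertModel

max_pos = 1024  # hypothetical value standing in for args.max_pos
bert = BertModel.from_pretrained('bert-base-uncased')
hidden = bert.config.hidden_size  # 768 for bert-base

# Same extension as the quoted code: positions 0..511 reuse the pretrained
# embeddings, positions 512.. are copies of the last pretrained row (index 511).
my_pos_embeddings = nn.Embedding(max_pos, hidden)
my_pos_embeddings.weight.data[:512] = bert.embeddings.position_embeddings.weight.data
my_pos_embeddings.weight.data[512:] = \
    bert.embeddings.position_embeddings.weight.data[-1][None, :].repeat(max_pos - 512, 1)
bert.embeddings.position_embeddings = my_pos_embeddings

print(bert.embeddings.position_embeddings.weight.shape)  # torch.Size([1024, 768])
```

Note that this snippet only grows the embedding table; it does not touch any other module of the encoder.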

But this code only replaces the position-embedding table; it does not seem to extend the Transformer encoder itself.
I thought that if the subsequent encoder does not get additional parameters, the shapes would not match for inputs longer than 512 tokens.

So my guess is that the Transformer automatically adds the corresponding parameters on its own. Is this understanding correct?