My question
Messiz opened this issue · 2 comments
Messiz commented
if trg_emb_prj_weight_sharing:
    # Share the weight between target word embedding & last dense layer
    self.trg_word_prj.weight = self.decoder.trg_word_emb.weight
if emb_src_trg_weight_sharing:
    self.encoder.src_word_emb.weight = self.decoder.trg_word_emb.weight
The code above is meant to implement weight sharing, but I'm confused: the embedding layer and the linear layer seem to have weights of different shapes. How can this assignment work?
yingying123321 commented
Messiz commented
Thank you for your answer. I actually figured it out by myself a few days later. Thanks anyway! 😂
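For anyone landing here with the same question, here is a minimal sketch of why the assignment is shape-compatible. It assumes the projection layer is an `nn.Linear(d_model, n_trg_vocab, bias=False)` and the embedding is an `nn.Embedding(n_trg_vocab, d_model)`. The sizes and the standalone variable names are made up for illustration and are not taken from the repository. The point is that `nn.Linear` stores its weight as `(out_features, in_features)`, so both weights end up as `(n_trg_vocab, d_model)`.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
n_trg_vocab, d_model = 10, 4

# Stand-ins for decoder.trg_word_emb and trg_word_prj (assumed layer types).
trg_word_emb = nn.Embedding(n_trg_vocab, d_model)
trg_word_prj = nn.Linear(d_model, n_trg_vocab, bias=False)

# nn.Embedding stores weight as (num_embeddings, embedding_dim);
# nn.Linear stores weight as (out_features, in_features).
emb_shape = tuple(trg_word_emb.weight.shape)
prj_shape = tuple(trg_word_prj.weight.shape)
print(emb_shape)  # (10, 4)
print(prj_shape)  # (10, 4)

# Assigning the Parameter makes both modules refer to the same tensor,
# so a gradient update through either layer changes both.
trg_word_prj.weight = trg_word_emb.weight
same_parameter = trg_word_prj.weight is trg_word_emb.weight

# The linear layer computes x @ weight.T, mapping d_model -> n_trg_vocab.
x = torch.randn(2, d_model)
logits = trg_word_prj(x)
logits_shape = tuple(logits.shape)
```

The same reasoning covers the source/target embedding sharing: both are embeddings of shape `(vocab, d_model)`, so the assignment only makes sense when the source and target vocabularies have the same size.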