Why use tie_encoder_decoder_embeddings instead of tie_decoder_embeddings in line 131 of transformer.py?
byccln opened this issue · 3 comments
I'm confused here, and I think using tie_decoder_embeddings would be correct.
I think you are right, this:
https://github.com/EdinburghNLP/nematus/blob/master/nematus/transformer.py#L131
is a bug. If a user would like to tie the decoder embeddings and the softmax output matrix with --tie_decoder_embeddings, this does not happen. Instead, those two matrices are only tied if --tie_encoder_decoder_embeddings is set.
Would you like to submit a PR?
Thanks; this has now been fixed to follow the documentation.
(This bug affects people who trained models with 2-way tying. People who trained models with 3-way tying of embeddings, as is also implemented in the sample training scripts at https://github.com/EdinburghNLP/wmt17-transformer-scripts, will be unaffected.)
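To make the distinction between 2-way and 3-way tying concrete, here is a minimal, purely illustrative sketch (not the actual Nematus implementation; the function name and parameters are hypothetical, though the flag names mirror the real CLI options). With 2-way tying, the decoder embedding matrix is reused as the softmax output projection; with 3-way tying, the encoder embedding, decoder embedding, and softmax output matrix are all one shared parameter:

```python
import numpy as np

def build_embeddings(vocab_size, dim,
                     tie_decoder_embeddings=False,
                     tie_encoder_decoder_embeddings=False,
                     seed=0):
    """Toy illustration of embedding tying schemes.

    Returns (encoder_emb, decoder_emb, softmax_out). Tied matrices are
    literally the same array object, so gradient updates to one would
    affect all of them in a real model.
    """
    rng = np.random.default_rng(seed)
    encoder_emb = rng.standard_normal((vocab_size, dim))
    if tie_encoder_decoder_embeddings:
        # 3-way tying: one matrix serves as encoder embedding,
        # decoder embedding, and softmax output projection.
        decoder_emb = encoder_emb
        softmax_out = encoder_emb
    else:
        decoder_emb = rng.standard_normal((vocab_size, dim))
        if tie_decoder_embeddings:
            # 2-way tying: decoder embeddings double as the
            # softmax output matrix.
            softmax_out = decoder_emb
        else:
            softmax_out = rng.standard_normal((vocab_size, dim))
    return encoder_emb, decoder_emb, softmax_out
```

The bug amounted to checking the 3-way flag where the 2-way flag was intended, so passing only --tie_decoder_embeddings left the decoder embedding and softmax matrices untied.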