EdinburghNLP/nematus

Why use tie_encoder_decoder_embeddings instead of tie_decoder_embeddings in line 131 of transformer.py?

byccln opened this issue · 3 comments

I'm confused here, and I think using tie_decoder_embeddings would be better.

I think you are right, this:

https://github.com/EdinburghNLP/nematus/blob/master/nematus/transformer.py#L131

is a bug. If a user wants to tie the decoder embeddings and the softmax output matrix with --tie_decoder_embeddings, this does not happen. Instead, those two matrices are also tied when --tie_encoder_decoder_embeddings is set.
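For readers following along, here is a minimal, self-contained sketch of the gating logic in question. This is not Nematus code: the class and function names, and the use of NumPy arrays in place of TensorFlow variables, are illustrative only; the two config flags are the ones discussed above.

```python
import numpy as np

class Config:
    """Hypothetical stand-in for Nematus's config object (illustrative only)."""
    def __init__(self, tie_decoder_embeddings=False,
                 tie_encoder_decoder_embeddings=False):
        self.tie_decoder_embeddings = tie_decoder_embeddings
        self.tie_encoder_decoder_embeddings = tie_encoder_decoder_embeddings

def build_decoder_matrices(config, vocab_size=100, embed_dim=8):
    """Return (decoder_embeddings, softmax_weights), tied as documented."""
    decoder_embeddings = np.random.randn(vocab_size, embed_dim)
    # The reported bug: the condition below checked
    #     config.tie_encoder_decoder_embeddings
    # instead, so --tie_decoder_embeddings on its own had no effect.
    if config.tie_decoder_embeddings:
        softmax_weights = decoder_embeddings   # share the same matrix
    else:
        softmax_weights = np.random.randn(vocab_size, embed_dim)
    return decoder_embeddings, softmax_weights

emb, softmax = build_decoder_matrices(Config(tie_decoder_embeddings=True))
assert softmax is emb   # 2-way tying: embeddings and softmax weights are shared
```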

Would you like to submit a PR?

thanks; this has now been fixed to follow the documentation.

(This bug affects people who trained models with 2-way tying. People who trained models with 3-way tying of embeddings, as is also implemented in the sample training scripts at https://github.com/EdinburghNLP/wmt17-transformer-scripts, are unaffected.)
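To spell out the distinction, a sketch of the two flag combinations (the flag names are the ones from this thread; the mapping to shared matrices reflects the documented intent, not the training scripts verbatim):

```python
# 2-way tying: decoder embeddings and softmax output matrix share one matrix.
# This is the configuration affected by the bug.
two_way = {"tie_decoder_embeddings": True}

# 3-way tying: encoder embeddings, decoder embeddings and softmax output matrix
# all share one matrix. Models trained this way are unaffected, because the
# (wrongly checked) --tie_encoder_decoder_embeddings flag was set anyway.
three_way = {
    "tie_decoder_embeddings": True,
    "tie_encoder_decoder_embeddings": True,
}
```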