EdinburghNLP/nematus

Why use tie_encoder_decoder_embeddings instead of tie_decoder_embeddings in line 131 of transformer.py?

byccln opened this issue · 3 comments

I'm confused here, and I think using tie_decoder_embeddings would be better.

I think you are right, this:

https://github.com/EdinburghNLP/nematus/blob/master/nematus/transformer.py#L131

is a bug. If a user wants to tie the decoder embeddings and the softmax output matrix with --tie_decoder_embeddings, this does not happen. Instead, those two matrices are also tied when --tie_encoder_decoder_embeddings is set.
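For readers following along, here is a minimal, self-contained sketch of the gating logic in question. This is not Nematus code: the class and function names, and the use of NumPy arrays in place of TensorFlow variables, are illustrative only; the two config flags are the ones discussed above.

```python
import numpy as np

class Config:
    """Hypothetical stand-in for Nematus's config object (illustrative only)."""
    def __init__(self, tie_decoder_embeddings=False,
                 tie_encoder_decoder_embeddings=False):
        self.tie_decoder_embeddings = tie_decoder_embeddings
        self.tie_encoder_decoder_embeddings = tie_encoder_decoder_embeddings

def build_decoder_matrices(config, vocab_size=100, embed_dim=8):
    """Return (decoder_embeddings, softmax_weights), tied as documented."""
    decoder_embeddings = np.random.randn(vocab_size, embed_dim)
    # The reported bug: the condition below checked
    #     config.tie_encoder_decoder_embeddings
    # instead, so --tie_decoder_embeddings on its own had no effect.
    if config.tie_decoder_embeddings:
        softmax_weights = decoder_embeddings   # share the same matrix
    else:
        softmax_weights = np.random.randn(vocab_size, embed_dim)
    return decoder_embeddings, softmax_weights

emb, softmax = build_decoder_matrices(Config(tie_decoder_embeddings=True))
assert softmax is emb   # 2-way tying: embeddings and softmax weights are shared
```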

Would you like to submit a PR?

thanks; this has now been fixed to follow the documentation.

(This bug affects people who trained models with 2-way tying. People who trained models with 3-way tying of embeddings, as is also implemented in the sample training scripts at https://github.com/EdinburghNLP/wmt17-transformer-scripts, are unaffected.)
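To spell out the distinction, a sketch of the two flag combinations (the flag names are the ones from this thread; the mapping to shared matrices reflects the documented intent, not the training scripts verbatim):

```python
# 2-way tying: decoder embeddings and softmax output matrix share one matrix.
# This is the configuration affected by the bug.
two_way = {"tie_decoder_embeddings": True}

# 3-way tying: encoder embeddings, decoder embeddings and softmax output matrix
# all share one matrix. Models trained this way are unaffected, because the
# (wrongly checked) --tie_encoder_decoder_embeddings flag was set anyway.
three_way = {
    "tie_decoder_embeddings": True,
    "tie_encoder_decoder_embeddings": True,
}
```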