bentrevett/pytorch-seq2seq

Tutorial 4: Decoder - the calculation of prediction

actforjason opened this issue · 2 comments

Why use torch.cat((output, weighted, embedded), dim=1)?
Isn't just using output enough?

        # drop the sequence-length dimension of size 1: [1, batch size, ...] -> [batch size, ...]
        embedded = embedded.squeeze(0)
        output = output.squeeze(0)
        weighted = weighted.squeeze(0)
        
        # predict from the concatenation of decoder output, attention-weighted context and embedded input
        prediction = self.fc_out(torch.cat((output, weighted, embedded), dim=1))

We could just use output, but the notebook is replicating this paper, which calculates the prediction using the decoder hidden state (output), the attention-weighted context (weighted), and the current input word (embedded) - see appendix section 2.2.

Maybe output is enough in this case. Feel free to try it and let me know if the results are any different.
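For illustration, here is a minimal sketch comparing the two output layers, with made-up dimensions and assuming a bidirectional encoder (hence the enc_hid_dim * 2 context size, as in the tutorial). It only shows how the input size of fc_out changes between the two variants, not the full decoder.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions, for illustration only.
emb_dim, enc_hid_dim, dec_hid_dim, output_dim = 256, 512, 512, 10_000
batch_size = 32

# Paper's formulation: the output layer sees the decoder state,
# the attention-weighted context and the embedded input token.
fc_out = nn.Linear((enc_hid_dim * 2) + dec_hid_dim + emb_dim, output_dim)

# Simplified alternative from the question: only the decoder state.
fc_out_simple = nn.Linear(dec_hid_dim, output_dim)

# Dummy tensors with the shapes they have after .squeeze(0) in the decoder.
embedded = torch.randn(batch_size, emb_dim)           # current input word
output = torch.randn(batch_size, dec_hid_dim)         # decoder hidden state
weighted = torch.randn(batch_size, enc_hid_dim * 2)   # attention-weighted context

prediction = fc_out(torch.cat((output, weighted, embedded), dim=1))
prediction_simple = fc_out_simple(output)

print(prediction.shape)         # torch.Size([32, 10000])
print(prediction_simple.shape)  # torch.Size([32, 10000])
```

Both variants produce a [batch size, output dim] prediction; the difference is only in how much information the final linear layer gets to condition on.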
Thank you, I got it.