question about network training
linhanxiao opened this issue · 3 comments
When training the network, the input is sample.input = {{targetIn, targetPosIn}, sourceIn}, and the loss is computed with crit:forward(net.output, sample.target). But {targetIn, targetPosIn} and sample.target are essentially the same thing, so the network input already contains information about the sample's target. Why is the network input set up this way?
Yes, training requires the sample's target information, because the decoder uses the previous token to predict the current token, and the current token to predict the next one.
(credit: https://www.tensorflow.org/tutorials/seq2seq)
In the image above, <go>, W, X, Y, Z at the bottom are the decoder inputs, and W, X, Y, Z, EOS at the top are the expected decoder outputs. When W is fed into the model, the decoder is expected to produce X, and so on. That is why most seq2seq models take two copies of the target, offset by one time step: one as the decoder input and one for loss evaluation.
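The one-step offset can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the repository's actual data pipeline: the helper name make_decoder_pair and the literal "<go>" and "EOS" strings are assumptions made for the example.

```python
# Hypothetical marker tokens, chosen to match the W, X, Y, Z example above.
GO, EOS = "<go>", "EOS"


def make_decoder_pair(target_tokens):
    """Build the two offset copies of one target sentence.

    decoder_input   : the target shifted right, starting with GO
    expected_output : the target followed by EOS, used for the loss
    """
    decoder_input = [GO] + list(target_tokens)
    expected_output = list(target_tokens) + [EOS]
    return decoder_input, expected_output


decoder_input, expected_output = make_decoder_pair(["W", "X", "Y", "Z"])

# At each step the decoder is fed one token and scored on the next one.
for step, (fed, wanted) in enumerate(zip(decoder_input, expected_output)):
    print(f"step {step}: feed {fed!r}, expect {wanted!r}")
```

Both lists come from the same sentence, but at every position the token being scored is the one that comes right after the token being fed in.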
Thank you very much.