Aligning in speaker
zwx8981 opened this issue · 2 comments
R2R-EnvDrop/r2r_src/speaker.py
Line 243 in 4c11585
Hi, I am confused about the aligning operation (as described in your comments). It seems that you ignore the last element of the predicted logits (the logits of 'EOS' or 'PAD') and the 'BOS' of the target when you compute the loss while training the speaker, which, in my view, makes the logits and target unaligned. Can you explain how this operation aligns the logits and the target?
Hmmm.
Let me explain it with an example. Suppose the sequence is "hello world". Under teacher forcing, the input and desired output would be:
Time:            1      2      3      4
Input:           <bos>  hello  world  <eos>
Desired Output:  hello  world  <eos>  ???
And the logits output by the model are:
Time:          1      2      3      4
Input:         <bos>  hello  world  <eos>
Output Logit:  L_0    L_1    L_2    L_3
Thus the aligned logits and targets would be:
logit[:-1]:  L_0    L_1    L_2
Input[1:]:   hello  world  <eos>
And that is what the code does.
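The same alignment can be sketched in plain Python. This is only an illustration, not the code in speaker.py: the vocabulary ids, the logit values, and the `cross_entropy` helper are all made up for this example.

```python
import math

# Hypothetical toy vocabulary; the ids are illustrative only.
vocab = {"<pad>": 0, "<bos>": 1, "<eos>": 2, "hello": 3, "world": 4}

tokens = ["<bos>", "hello", "world", "<eos>"]
ids = [vocab[t] for t in tokens]

# Stand-in for the model output: one row of logits per input position.
# Row t is the prediction for the token at position t + 1.
logits = [
    [0.0, 0.0, 0.0, 5.0, 0.0],  # L_0: after <bos>, favours "hello"
    [0.0, 0.0, 0.0, 0.0, 5.0],  # L_1: after "hello", favours "world"
    [0.0, 0.0, 5.0, 0.0, 0.0],  # L_2: after "world", favours <eos>
    [5.0, 0.0, 0.0, 0.0, 0.0],  # L_3: after <eos>, has no target
]

aligned_logits = logits[:-1]  # drop L_3, which has nothing to predict
targets = ids[1:]             # drop <bos>, which is never predicted


def cross_entropy(row, target):
    """Negative log-softmax of the target entry for one position."""
    log_z = math.log(sum(math.exp(x) for x in row))
    return log_z - row[target]


# Each remaining logit row is paired with the token that follows its input.
loss = sum(
    cross_entropy(row, tgt) for row, tgt in zip(aligned_logits, targets)
) / len(targets)

# Most likely token id at each aligned position.
predicted = [max(range(len(row)), key=row.__getitem__) for row in aligned_logits]
```

After slicing, both sequences have length 3, and position t of `aligned_logits` is scored against the token at position t + 1 of the input.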
Hope this answers your question!
Oh, that's very clear! Thank you so much!