hkproj/pytorch-transformer

Clarification regarding decoder_input and label

Closed this issue · 1 comments

The way I see it now, we feed the decoder input entirely into the decoder. But the decoder input is formed from the target text.

Shouldn't the decoder only get a start-of-sentence token plus padding, and then get its own output in each iteration?

Oh, of course: we use a causal mask. Handing the decoder the entire target sequence in one go (teacher forcing) just speeds up training, while the mask prevents any position from seeing future tokens. Closing issue.
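For anyone landing here later, a minimal sketch of the idea (this is an illustration, not the repo's exact code): a lower-triangular mask applied to the decoder's self-attention means position `i` can only attend to positions `<= i`, so feeding the whole shifted target sequence in one pass leaks no future tokens.

```python
import torch

def causal_mask(size: int) -> torch.Tensor:
    # Lower-triangular boolean mask: True = attention allowed.
    # Row i has True in columns 0..i, so position i cannot see
    # any position after itself, even though the full target
    # sequence is fed to the decoder in a single forward pass.
    return torch.tril(torch.ones(size, size, dtype=torch.bool))

mask = causal_mask(4)
# Row 0 can only see position 0; row 3 can see positions 0..3.
print(mask)
```

At inference time there is no target text to feed, so generation really does run one token at a time, each step appending the previous output; the mask is what makes the single-pass training behave identically to that loop.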