Why take the first element of a batch after padding RNN output?

Question

shengyuzhang opened this issue 6 years ago · 2 comments

From my understanding, the code indicates that:

After padding the RNN output "out" to "padded" with batch_first=True, the first dimension of "padded" should be batch_size, so the operation "padded[0]" would select the first element of the batch. That seems unusual and is hard to understand. Am I wrong? Could someone explain the purpose of this code?

Thanks in advance.

Answer 1 · 2019-04-01T08:51:31.000Z

It seems that pad_packed_sequence returns a tuple rather than a tensor of shape [batch_size, seq_len, embed_size].

Answer 2 · 2019-04-02T04:18:20.000Z

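As you correctly pointed out, pad_packed_sequence returns a tuple. Its first element is the padded sequence tensor and its second is a tensor of sequence lengths, so padded[0] selects the padded output, not the first item in the batch.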
https://pytorch.org/docs/stable/nn.html#torch.nn.utils.rnn.pad_packed_sequence
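
To make the tuple return value concrete, here is a minimal sketch. The GRU layer, the tensor sizes, and the variable names are assumptions chosen for illustration and are not taken from the code in question; only the pack_padded_sequence / pad_packed_sequence calls mirror what is being discussed.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Illustrative sizes (assumed, not from the original code).
batch_size, max_len, embed_size, hidden_size = 3, 5, 4, 6

inputs = torch.randn(batch_size, max_len, embed_size)
lengths = torch.tensor([5, 3, 2])  # true lengths, sorted in descending order

rnn = nn.GRU(embed_size, hidden_size, batch_first=True)

# Pack so the RNN skips the padded time steps.
packed = pack_padded_sequence(inputs, lengths, batch_first=True)
out, _ = rnn(packed)  # "out" is a PackedSequence

# pad_packed_sequence returns a tuple: (padded tensor, lengths tensor).
padded = pad_packed_sequence(out, batch_first=True)

output = padded[0]       # the whole padded batch, not the first sample
out_lengths = padded[1]  # the original sequence lengths

print(type(padded).__name__)
print(output.shape)
print(out_lengths)
```

Unpacking the result directly, as in `output, out_lengths = pad_packed_sequence(out, batch_first=True)`, is equivalent and tends to read more clearly than indexing with [0].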

This comes from an old PyTorch tutorial that I can no longer find. The purpose is to handle variable-length sequences. Nowadays it is more common to pad sequences to equal length when the batch is created. The official PyTorch examples take this approach, for example:
https://github.com/pytorch/examples/tree/master/word_language_model
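
As a rough illustration of padding at batch-creation time, the sketch below uses torch.nn.utils.rnn.pad_sequence inside a collate-style function. The function name collate, the toy token ids, and the choice of 0 as the padding value are assumptions for this example; it is not taken from the linked repository.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def collate(batch):
    """Pad a list of 1-D tensors to a common length (hypothetical helper)."""
    lengths = torch.tensor([len(seq) for seq in batch])
    # Assumption: 0 is reserved as the padding index.
    padded = pad_sequence(batch, batch_first=True, padding_value=0)
    return padded, lengths


# Toy variable-length sequences of token ids.
seqs = [
    torch.tensor([1, 2, 3, 4]),
    torch.tensor([5, 6]),
    torch.tensor([7]),
]

padded, lengths = collate(seqs)
print(padded.shape)  # every row now has the length of the longest sequence
print(lengths)
```

A function like this can be passed as collate_fn to a DataLoader, so every batch reaches the model as one rectangular tensor together with the true lengths.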