jadore801120/attention-is-all-you-need-pytorch

MultiHeadAttention input shape

Superklez opened this issue · 0 comments

Is the input shape of `MultiHeadAttention` `[batch_size, sequence_length, embedding_size]`? Or is it the same as `nn.MultiheadAttention`, where the input shape must be `[sequence_length, batch_size, embedding_size]`?
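For reference, the two conventions mentioned in the question can be seen directly with PyTorch's built-in `nn.MultiheadAttention`: it defaults to `[sequence_length, batch_size, embedding_size]`, and accepts `batch_first=True` (PyTorch 1.9+) for `[batch_size, sequence_length, embedding_size]`. This sketch only illustrates the built-in module's behavior, not this repository's `MultiHeadAttention`; the dimensions used are arbitrary examples.

```python
import torch
import torch.nn as nn

embed_dim, num_heads = 16, 4   # example sizes, chosen arbitrarily
seq_len, batch_size = 10, 2

# Default convention: inputs are [sequence_length, batch_size, embedding_size]
mha_seq_first = nn.MultiheadAttention(embed_dim, num_heads)
x_seq_first = torch.randn(seq_len, batch_size, embed_dim)
out_seq, _ = mha_seq_first(x_seq_first, x_seq_first, x_seq_first)
print(out_seq.shape)  # torch.Size([10, 2, 16])

# With batch_first=True: inputs are [batch_size, sequence_length, embedding_size]
mha_batch_first = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
x_batch_first = torch.randn(batch_size, seq_len, embed_dim)
out_batch, _ = mha_batch_first(x_batch_first, x_batch_first, x_batch_first)
print(out_batch.shape)  # torch.Size([2, 10, 16])
```

In both cases the output tensor has the same layout as the query input, so checking the output shape is a quick way to confirm which convention a given attention module uses.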