Question about the SeqSelfAttention.
katekats opened this issue · 0 comments
katekats commented
My question is: in the additive self-attention approach, are word embeddings from other timesteps taken into account when calculating the attention weights, or only the embedding at the current timestep (i.e., the word embeddings of the current sentence/input)?
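For context, here is a minimal NumPy sketch of additive (Bahdanau-style) self-attention as it is usually formulated; the weight names (`Wt`, `Wx`, `v`) are illustrative and not necessarily the layer's actual internals. It shows that the score for each timestep is computed against every other timestep in the sequence, so the attention weights depend on all embeddings in the input:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, da = 5, 8, 16            # timesteps, embedding dim, attention dim

X = rng.normal(size=(T, d))    # word embeddings for one sentence
Wt = rng.normal(size=(d, da))  # transforms the "query" timestep
Wx = rng.normal(size=(d, da))  # transforms every "key" timestep
v = rng.normal(size=(da,))

q = X @ Wt                     # (T, da)
k = X @ Wx                     # (T, da)

# e[t, s] scores timestep t against *every* timestep s
e = np.tanh(q[:, None, :] + k[None, :, :]) @ v   # (T, T)

# softmax over s: each row of attention weights spans all timesteps
a = np.exp(e - e.max(axis=1, keepdims=True))
a /= a.sum(axis=1, keepdims=True)

out = a @ X                    # (T, d): each output mixes all timesteps
```

In this formulation the answer would be "all timesteps of the current input sequence": the softmax at position `t` is taken over scores against every position `s`, not just `t` itself.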