Multi-head attention
Closed this issue · 2 comments
aaaaaaui commented
Hello author, your attention mechanism is very convenient to use, but I can't get multi-head attention to run. Is this all it takes?
self.multihead_attn = MultiHeadedAttention(4, self.hidden_size, 0.2).cuda()
lizhi317 commented
Hello, author, your attention mechanism is very convenient to use, but how can multi-head attention work? Is this OK:
self.multihead_attn = MultiHeadedAttention(4, self.hidden_size, 0.2).cuda()

Did you get this to run?
johnny12150 commented
Hello, author, your attention mechanism is very convenient to use, but how can multi-head attention work? Is this OK:
self.multihead_attn = MultiHeadedAttention(4, self.hidden_size, 0.2).cuda()

Did you get this to run?
This bug should be fixed now.
For more details, you can check out this commit.
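For reference, the head-splitting computation that a call like MultiHeadedAttention(4, self.hidden_size, 0.2) performs can be sketched as follows. This is an illustrative NumPy sketch of standard multi-head scaled dot-product attention, not the repo's actual implementation; all names and shapes here are assumptions (dropout omitted for clarity):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(q, k, v, num_heads):
    """Scaled dot-product attention computed independently per head."""
    batch, seq_len, d_model = q.shape
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_k = d_model // num_heads

    def split(x):
        # (batch, seq, d_model) -> (batch, heads, seq, d_k)
        return x.reshape(batch, -1, num_heads, d_k).transpose(0, 2, 1, 3)

    qh, kh, vh = split(q), split(k), split(v)
    # Attention scores scaled by sqrt(d_k): (batch, heads, seq, seq)
    scores = qh @ kh.transpose(0, 1, 3, 2) / np.sqrt(d_k)
    attn = softmax(scores, axis=-1)
    out = attn @ vh  # (batch, heads, seq, d_k)
    # Merge heads back: (batch, seq, d_model)
    return out.transpose(0, 2, 1, 3).reshape(batch, seq_len, d_model)

x = np.random.randn(2, 5, 16)
y = multi_head_attention(x, x, x, num_heads=4)
print(y.shape)  # (2, 5, 16)
```

Note that d_model (here 16, standing in for self.hidden_size) must be divisible by the number of heads (4 in the snippet above), which is a common cause of multi-head attention failing to run.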