johnny12150/GC-SAN

Multi-head attention

Closed this issue · 2 comments

Hello author, your attention mechanism is very convenient to use, but I can't get the multi-head attention to run. Is it enough to do it like this?

self.multihead_attn = MultiHeadedAttention(4, self.hidden_size, 0.2).cuda()

Did you manage to get this running?


This bug should be fixed now.
For more details, you can check out this commit.
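For reference, here is a minimal runnable sketch of the call pattern in question. It assumes the repo's `MultiHeadedAttention(h, d_model, dropout)` takes the number of heads, the model dimension, and the dropout rate (as the line in the issue suggests), and uses PyTorch's built-in `nn.MultiheadAttention` as a stand-in, since the repo's own class is not reproduced here:

```python
import torch
import torch.nn as nn

# Assumed hyperparameters: hidden_size=100 (must be divisible by the
# number of heads), 4 heads, dropout 0.2 -- mirroring
# MultiHeadedAttention(4, self.hidden_size, 0.2) from the issue.
hidden_size = 100
multihead_attn = nn.MultiheadAttention(embed_dim=hidden_size,
                                       num_heads=4,
                                       dropout=0.2)

# Default layout for nn.MultiheadAttention is (seq_len, batch, embed_dim).
seq_len, batch = 10, 32
x = torch.randn(seq_len, batch, hidden_size)

# Self-attention over the session sequence: query = key = value.
out, attn_weights = multihead_attn(x, x, x)
# out has shape (seq_len, batch, hidden_size), same as the input.
```

On GPU, the module and inputs would be moved with `.cuda()` as in the issue; the call pattern itself is unchanged.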