Multi-head attention
Closed this issue · 2 comments
aaaaaaui commented
Hello author, your attention mechanism is very convenient to use, but I can't get multi-head attention to run. Is this all it takes?
self.multihead_attn = MultiHeadedAttention(4, self.hidden_size, 0.2).cuda()
lizhi317 commented
Hello, author, your attention mechanism is very convenient to use, but how can multi-head attention work? Is this OK:
self.multihead_attn = MultiHeadedAttention(4, self.hidden_size, 0.2).cuda()

Did you get this to run?
johnny12150 commented
Hello, author, your attention mechanism is very convenient to use, but how can multi-head attention work? Is this OK:
self.multihead_attn = MultiHeadedAttention(4, self.hidden_size, 0.2).cuda()

Did you get this to run?
This bug should be fixed now.
For more details, you can check out this commit.
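For reference, the head-splitting computation that a call like MultiHeadedAttention(4, self.hidden_size, 0.2) performs can be sketched as follows. This is an illustrative NumPy sketch of standard multi-head scaled dot-product attention, not the repo's actual implementation; all names and shapes here are assumptions (dropout omitted for clarity):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(q, k, v, num_heads):
    """Scaled dot-product attention computed independently per head."""
    batch, seq_len, d_model = q.shape
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_k = d_model // num_heads

    def split(x):
        # (batch, seq, d_model) -> (batch, heads, seq, d_k)
        return x.reshape(batch, -1, num_heads, d_k).transpose(0, 2, 1, 3)

    qh, kh, vh = split(q), split(k), split(v)
    # Attention scores scaled by sqrt(d_k): (batch, heads, seq, seq)
    scores = qh @ kh.transpose(0, 1, 3, 2) / np.sqrt(d_k)
    attn = softmax(scores, axis=-1)
    out = attn @ vh  # (batch, heads, seq, d_k)
    # Merge heads back: (batch, seq, d_model)
    return out.transpose(0, 2, 1, 3).reshape(batch, seq_len, d_model)

x = np.random.randn(2, 5, 16)
y = multi_head_attention(x, x, x, num_heads=4)
print(y.shape)  # (2, 5, 16)
```

Note that d_model (here 16, standing in for self.hidden_size) must be divisible by the number of heads (4 in the snippet above), which is a common cause of multi-head attention failing to run.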