F.multi_head_attention_forward missing parameter 'average_attn_weights=True'?
Opened this issue · 1 comment
yihanxxu commented
TypeError: multi_head_attention_forward got an unexpected keyword argument 'average_attn_weights'
Even-ok commented
If your torch version is older than 2.0 and rejects this argument, you can simply remove the
'average_attn_weights=True' option: in those versions, 'multi_head_attention_forward' already
averages the attention weights over heads whenever need_weights is set.
    def multi_head_attention_forward(
        .....
        if need_weights:
            # average attention weights over heads
            attn_output_weights = attn_output_weights.view(bsz, num_heads, tgt_len, src_len)
            return attn_output, attn_output_weights.sum(dim=1) / num_heads
        else:
            return attn_output, None
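If you'd rather not edit the call by hand for each torch version, one option is to pass the keyword only when the installed function declares it. Below is a minimal sketch of that idea using the standard-library inspect module. The helper supported_kwargs and the two stand-in functions old_forward and new_forward are hypothetical names made up for illustration, not part of torch.

```python
import inspect


def supported_kwargs(func, **kwargs):
    """Return only the keyword arguments that `func` declares in its signature.

    Hypothetical helper for illustration; not part of torch.
    """
    params = inspect.signature(func).parameters
    # A function that takes **kwargs accepts any keyword, so pass everything through.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {name: value for name, value in kwargs.items() if name in params}


# Stand-ins (assumptions) mimicking an older and a newer signature.
def old_forward(query, need_weights=True):
    return "old", need_weights


def new_forward(query, need_weights=True, average_attn_weights=True):
    return "new", average_attn_weights


# The unsupported keyword is dropped for the old signature and kept for the new one.
old_result = old_forward(
    "q", **supported_kwargs(old_forward, need_weights=True, average_attn_weights=True)
)
new_result = new_forward(
    "q", **supported_kwargs(new_forward, need_weights=True, average_attn_weights=False)
)
```

With real torch, the same pattern would presumably be to build the extra keywords with supported_kwargs(F.multi_head_attention_forward, average_attn_weights=True) and unpack them into the call. Note that dropping the argument only preserves behavior when the value is True: the older code shown above always averages over heads, so per-head weights (average_attn_weights=False) are not available there.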