Why doesn't increasing the number of attention heads increase the complexity or parameter count?

Question

Opened this issue a year ago · 0 comments

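In TransNet.py, the model is created with `model = Transformer(d_model=d_model, num_encoder_layers=2, num_decoder_layers=2, nhead=2, reduction=reduction, dropout=0.)`. Why does changing the value of nhead not change the model's complexity or parameter count? I would appreciate a reply.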
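For context, here is a minimal sketch of the parameter arithmetic. It assumes the model uses standard multi-head attention (the convention followed by `torch.nn.MultiheadAttention`), where each head has width `d_model // nhead`. Whether the `Transformer` class in TransNet.py follows this convention is an assumption, not something confirmed here. The function name `mha_param_count` and the value `d_model = 64` are hypothetical, chosen only for illustration.

```python
def mha_param_count(d_model: int, nhead: int) -> int:
    """Count weights and biases in one standard multi-head attention block."""
    if d_model % nhead != 0:
        raise ValueError("d_model must be divisible by nhead")
    head_dim = d_model // nhead  # each head gets a slice of the model width
    # Q, K and V projections: each head maps d_model -> head_dim, plus a bias.
    per_head = 3 * (d_model * head_dim + head_dim)
    # Output projection maps the concatenated heads back to d_model, plus a bias.
    out_proj = (nhead * head_dim) * d_model + d_model
    return nhead * per_head + out_proj


d_model = 64  # hypothetical width, not taken from TransNet.py
counts = {nhead: mha_param_count(d_model, nhead) for nhead in (1, 2, 4, 8)}
print(counts)
```

Under this assumption, adding heads splits the same `d_model`-wide projections into narrower slices, so `nhead * head_dim` always equals `d_model` and the total stays the same. The same cancellation applies to the attention score computation, whose cost scales as `nhead * seq_len^2 * head_dim = seq_len^2 * d_model`.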