
A bug when I add encoder layers

RuifMaxx opened this issue · 2 comments

Thanks very much for your code. However, there is a difference between the class TransAm in your code and the PyTorch tutorial "Sequence-to-Sequence Modeling with nn.Transformer and TorchText".

According to https://www.zhihu.com/question/67209417/answer/1264503855, adding `self.` to `encoder_layers` causes `self.encoder_layers` to be registered as parameters of the module, but only `self.transformer_encoder` is used in the forward pass. The `nlayers` layers inside it are copies of the `nn.TransformerEncoderLayer` passed in, not the original object. That is to say, `self.encoder_layers` does not participate in the model's computation, so its parameters get no gradient in backward, which leads to training errors.
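To illustrate, here is a minimal sketch of the behaviour described above. It is not the repository's `TransAm` class: the class names `WithAttribute` and `WithLocal`, the helper `params_without_grad`, and the small sizes are made up for this example. It relies on `nn.TransformerEncoder` deep-copying the layer it is given.

```python
import torch
import torch.nn as nn


class WithAttribute(nn.Module):
    """Stores the template layer on self (the pattern discussed in this issue)."""

    def __init__(self, d_model=16, nhead=2, nlayers=2):
        super().__init__()
        # Registered as a submodule, so its parameters belong to the model...
        self.encoder_layers = nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward=32)
        # ...but TransformerEncoder works on deep copies of it, not on the original.
        self.transformer_encoder = nn.TransformerEncoder(self.encoder_layers, nlayers)

    def forward(self, src):
        return self.transformer_encoder(src)


class WithLocal(nn.Module):
    """Keeps the template layer in a local variable, as in the PyTorch tutorial."""

    def __init__(self, d_model=16, nhead=2, nlayers=2):
        super().__init__()
        encoder_layers = nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward=32)
        self.transformer_encoder = nn.TransformerEncoder(encoder_layers, nlayers)

    def forward(self, src):
        return self.transformer_encoder(src)


def params_without_grad(model):
    """Run one forward/backward pass and list parameters that received no gradient."""
    src = torch.randn(5, 3, 16)  # (seq_len, batch, d_model)
    model(src).sum().backward()
    return [name for name, p in model.named_parameters() if p.grad is None]


if __name__ == "__main__":
    print("with self.:", params_without_grad(WithAttribute()))
    print("local variable:", params_without_grad(WithLocal()))
```

In the first variant the names listed should all start with `encoder_layers.`, while the second variant should list nothing. Setups that expect every registered parameter to receive a gradient (for example `DistributedDataParallel` without `find_unused_parameters=True`) can raise errors in that situation.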

Well, thank you @RuifMaxx. I did not think that torch would use this layer as long as I am not referencing it.

Well, thanks for your reply.