A bug when I add encoder layers
RuifMaxx opened this issue · 2 comments
Thanks very much for your code. However, there are some differences in class TransAm between your code and the PyTorch tutorial SEQUENCE-TO-SEQUENCE MODELING WITH NN.TRANSFORMER AND TORCHTEXT.
According to https://www.zhihu.com/question/67209417/answer/1264503855, adding self. to encoder_layers causes self.encoder_layers to be registered as a submodule, so its weights are counted among the module's parameters. However, only self.transformer_encoder is used when the network runs: the nlayers copies made from the encoder layer are what actually hold the nn.TransformerEncoderLayer parameters. That is to say, self.encoder_layers does not participate in the model's computation, so it receives no gradient in backward, which leads to training errors.
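For reference, here is a minimal sketch of the two patterns (the class names TransAmBuggy/TransAmFixed and the hyperparameter values are illustrative assumptions, not the repository's exact code):

```python
import torch.nn as nn

class TransAmBuggy(nn.Module):
    # Pattern described above (sketch only).
    def __init__(self, feature_size=256, nlayers=2, dropout=0.1):
        super().__init__()
        # Assigning the layer to self registers its weights as parameters of
        # the model, but nn.TransformerEncoder deep-copies the layer nlayers
        # times, so this original copy is never used in forward() and never
        # receives a gradient.
        self.encoder_layers = nn.TransformerEncoderLayer(
            d_model=feature_size, nhead=8, dropout=dropout)
        self.transformer_encoder = nn.TransformerEncoder(
            self.encoder_layers, num_layers=nlayers)

class TransAmFixed(nn.Module):
    # The tutorial's pattern: keep the layer in a local variable so that only
    # the copies inside self.transformer_encoder are registered as parameters.
    def __init__(self, feature_size=256, nlayers=2, dropout=0.1):
        super().__init__()
        encoder_layers = nn.TransformerEncoderLayer(
            d_model=feature_size, nhead=8, dropout=dropout)
        self.transformer_encoder = nn.TransformerEncoder(
            encoder_layers, num_layers=nlayers)
```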
Well, thank you @RuifMaxx. I did not think that torch would use this layer as long as I was not referencing it.
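A quick standalone check of that behaviour (with a hypothetical Demo module, not code from this repository): any nn.Module assigned to an attribute of self is registered, whether or not forward() ever touches it.

```python
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered as a submodule just by being assigned to self,
        # even though forward() never calls it.
        self.unused = nn.Linear(4, 4)
        self.used = nn.Linear(4, 4)

    def forward(self, x):
        return self.used(x)

m = Demo()
print(sorted(name for name, _ in m.named_parameters()))
# ['unused.bias', 'unused.weight', 'used.bias', 'used.weight']
```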
Well, thanks for your reply.