google-deepmind/deepmind-research

Layer normalization

goonm opened this issue · 0 comments

goonm commented

Thank you for sharing your research.
I have a question about layer normalization in your network.
Unlike conventional CNNs or transformers, your network applies a single layer normalization only after the input has passed through several GCN layers.
Is there a particular reason for applying it once at the end of the encoder instead of after each GCN layer?
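
To make the question concrete, here is a minimal NumPy sketch of the two placements I am comparing. It is not taken from the repository: `gcn_layer`, `encoder_norm_at_end`, `encoder_norm_per_layer`, the ReLU activation, the toy graph, and the omission of learned scale and offset in `layer_norm` are all my own assumptions for illustration.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalize each node's feature vector to zero mean and unit variance.
    # Learned scale/offset parameters are omitted (assumption, for brevity).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def gcn_layer(a_hat, h, w):
    # Hypothetical generic graph convolution: aggregate over neighbours
    # with the normalized adjacency, project, then apply ReLU.
    return np.maximum(a_hat @ h @ w, 0.0)


def encoder_norm_at_end(a_hat, h, weights):
    # Placement described in the question: several GCN layers,
    # then a single layer normalization at the end of the encoder.
    for w in weights:
        h = gcn_layer(a_hat, h, w)
    return layer_norm(h)


def encoder_norm_per_layer(a_hat, h, weights):
    # Alternative placement: layer normalization after every GCN layer.
    for w in weights:
        h = layer_norm(gcn_layer(a_hat, h, w))
    return h


# Toy 3-node path graph with self-loops and symmetric normalization.
adj = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
a_tilde = adj + np.eye(3)
d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
a_hat = a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

rng = np.random.default_rng(0)
h0 = rng.normal(size=(3, 4))
weights = [rng.normal(size=(4, 4)) for _ in range(3)]

out_end = encoder_norm_at_end(a_hat, h0, weights)
out_per = encoder_norm_per_layer(a_hat, h0, weights)
```

Both variants return normalized node features of the same shape; they differ only in whether the intermediate activations between GCN layers are normalized.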