MenghaoGuo/PCT

What are your thoughts on Layer Normalization vs. Batch Normalization?

Alobal opened this issue · 0 comments

Hi,

PCT uses Batch Normalization instead of the Layer Normalization used by the original Transformer.

I wonder what your reasoning was for choosing Batch Normalization over Layer Normalization in PCT?
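
For context, here is a minimal sketch of how the two differ on point-cloud features. I am assuming PyTorch and a `(batch, channels, num_points)` feature layout, which is common in point-cloud pipelines; the shapes and variable names here are my own illustration, not taken from the PCT code.

```python
import torch
import torch.nn as nn

B, C, N = 8, 128, 1024          # batch size, feature channels, points per cloud (assumed values)
x = torch.randn(B, C, N)        # per-point features

# BatchNorm1d normalizes each channel using statistics over the batch and point dimensions.
bn = nn.BatchNorm1d(C)
y_bn = bn(x)                    # stats over (B, N) for each of the C channels

# LayerNorm (as in the original Transformer) normalizes each point over its channels,
# independently of the batch.
ln = nn.LayerNorm(C)
y_ln = ln(x.transpose(1, 2)).transpose(1, 2)   # stats over C for each point

print(y_bn.shape, y_ln.shape)   # both torch.Size([8, 128, 1024])
```

So the two choices compute statistics over different dimensions, which is why I am curious about the trade-off you considered for point clouds.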