LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence
Primary Language: Python
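As a rough illustration of what LayerNorm(SmallInit(Embedding)) could look like, here is a dependency-free sketch in plain Python. The helper names (`small_init_embedding`, `layer_norm`, `embed`), the uniform init range of `1e-4`, and the `eps` value are assumptions for illustration and are not taken from the repository. A real model would more likely use a deep-learning framework's embedding and layer-norm modules.

```python
import math
import random


def small_init_embedding(vocab_size, dim, scale=1e-4, seed=0):
    """Build a vocab_size x dim table with entries uniform in [-scale, scale].

    The tiny `scale` is the "SmallInit" part; 1e-4 is an assumed value.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-scale, scale) for _ in range(dim)]
            for _ in range(vocab_size)]


def layer_norm(x, gain=None, bias=None, eps=1e-5):
    """Subtract the mean, divide by sqrt(variance + eps), apply gain and bias."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    gain = gain if gain is not None else [1.0] * n  # learnable in a real model
    bias = bias if bias is not None else [0.0] * n  # learnable in a real model
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gain, bias)]


def embed(token_ids, table):
    """Look up each token's small-init vector, then layer-normalize it."""
    return [layer_norm(table[t]) for t in token_ids]


if __name__ == "__main__":
    table = small_init_embedding(vocab_size=10, dim=8)
    vectors = embed([1, 3, 3], table)
    print("raw max |value|:       ", max(abs(v) for v in table[1]))
    print("normalized max |value|:", max(abs(v) for v in vectors[0]))
```

Note that with these assumed values `eps` is much larger than the variance of the raw embedding, so the normalization step amplifies the vector by roughly `1 / sqrt(eps)` instead of scaling it to exactly unit variance.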