/layer_norm_expressivity_role

Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL'2023)

Primary LanguagePython

Watchers