/SimplifiedTransformers

SimplifiedTransformer simplifies the standard transformer block without hurting training. It removes skip connections, projection parameters, sequential sub-blocks, and normalization layers. Experimental results show training speed and performance similar to the standard block.
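To make the list of removed components concrete, here is a minimal NumPy sketch of what such a block could look like. It is an illustration under stated assumptions, not this repository's code: the function name `simplified_block`, the weight names, and the shapes are all hypothetical. "Projection parameters" is read here as the value and output projections, replaced by the identity, and "no sequential sub-blocks" is read as running attention and the MLP in parallel on the same input. Any attention reshaping the actual implementation may need to stay trainable without skip connections is left out.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def simplified_block(x, w_q, w_k, w_in, w_out, n_heads):
    """Hypothetical simplified block (illustrative names and shapes).

    x:     (seq_len, d_model) input tokens
    w_q:   (d_model, d_model) query weights
    w_k:   (d_model, d_model) key weights
    w_in:  (d_model, d_hidden) MLP input weights
    w_out: (d_hidden, d_model) MLP output weights
    """
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    def split_heads(t):
        # (seq_len, d_model) -> (n_heads, seq_len, d_head)
        return t.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    # Assumption: no value or output projection, so the values are x itself.
    q, k, v = split_heads(x @ w_q), split_heads(x @ w_k), split_heads(x)

    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (n_heads, T, T)
    attn = softmax(scores) @ v                            # (n_heads, T, d_head)
    attn_out = attn.transpose(1, 0, 2).reshape(seq_len, d_model)

    # The MLP reads the same input x (parallel, not sequential, sub-blocks).
    mlp_out = np.maximum(x @ w_in, 0.0) @ w_out

    # No skip connection and no normalization: just sum the two branches.
    return attn_out + mlp_out


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q = rng.normal(size=(8, 8)) * 0.1
w_k = rng.normal(size=(8, 8)) * 0.1
w_in = rng.normal(size=(8, 16)) * 0.1
w_out = rng.normal(size=(16, 8)) * 0.1

y = simplified_block(x, w_q, w_k, w_in, w_out, n_heads=2)
print(y.shape)  # (5, 8)
```

Compared with a standard pre-norm block, the sketch has no `x + ...` residual term, no normalization call, and two fewer weight matrices in the attention branch.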

Primary language: Python · License: MIT
