/param-share-transformer

PyTorch implementation of "Lessons on Parameter Sharing across Layers in Transformers"

Primary language: Python. License: MIT.
