TUPE

Transformer with Untied Positional Encoding (TUPE). Code for the paper "Rethinking Positional Encoding in Language Pre-training". TUPE improves existing pre-trained models such as BERT.
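The core idea of TUPE is to untie word and position in self-attention: word-to-word and position-to-position correlations are computed with separate projection matrices and then summed, instead of adding positional embeddings to word embeddings before a shared projection. Below is a minimal, single-head PyTorch sketch of that untied score computation under my reading of the paper; it is illustrative only, and all names (`UntiedPositionAttention`, `w_q`, `u_q`, `max_len`, ...) are assumptions, not this repository's actual API.

```python
import math
import torch
import torch.nn as nn

class UntiedPositionAttention(nn.Module):
    """Single-head sketch of TUPE-style untied attention scores.

    Content and position use separate Q/K projections; their score
    matrices are summed, rather than mixing positional embeddings
    into the word embeddings before one shared projection.
    """

    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        self.d_model = d_model
        # Projections for word content (W) and for positions (U).
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.u_q = nn.Linear(d_model, d_model)
        self.u_k = nn.Linear(d_model, d_model)
        # Absolute positional embeddings, kept separate from words.
        self.pos = nn.Embedding(max_len, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) word embeddings, no positions added.
        seq_len = x.size(1)
        p = self.pos(torch.arange(seq_len, device=x.device))  # (seq_len, d_model)
        # Word-to-word correlation term.
        content = self.w_q(x) @ self.w_k(x).transpose(-2, -1)
        # Position-to-position correlation term, shared across the batch.
        position = self.u_q(p) @ self.u_k(p).transpose(-2, -1)
        # 1/sqrt(2d) scaling keeps the variance of the summed scores
        # comparable to standard 1/sqrt(d) scaled dot-product attention.
        scores = (content + position) / math.sqrt(2 * self.d_model)
        return torch.softmax(scores, dim=-1)
```

A full implementation would additionally be multi-headed and, as in the paper, untie the [CLS] token from positional correlations so it can attend globally; this sketch omits both for brevity.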

Primary language: Python · License: MIT
