Rectified Linear Attention (TensorFlow implementation) from the paper Sparse Attention with Linear Units
- Pre-Training file and Fine-Tune file
- HF repo
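For orientation, the core operation can be sketched in a few lines of NumPy. This is a minimal single-head illustration, not this repository's TensorFlow code: it assumes the paper's basic formulation, in which the softmax over scaled dot-product scores is replaced by a ReLU and the attention output is then normalised, here with RMS normalisation. The names `rela`, `rms_norm`, `gain`, and `eps` are hypothetical, and masking, multiple heads, and the gated variant are left out.

```python
import numpy as np


def rms_norm(x, gain, eps=1e-8):
    # Scale each vector by its root mean square; eps avoids division by zero
    # when a query attends to nothing and its context vector is all zeros.
    rms = np.sqrt(np.mean(np.square(x), axis=-1, keepdims=True) + eps)
    return x / rms * gain


def rela(q, k, v, gain, eps=1e-8):
    # q: (n_q, d), k: (n_k, d), v: (n_k, d_v), gain: (d_v,)
    d = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)  # scaled dot product
    weights = np.maximum(scores, 0.0)                 # ReLU instead of softmax
    context = weights @ v                             # weighted sum of values
    return rms_norm(context, gain, eps), weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.normal(size=(4, 8))
    k = rng.normal(size=(6, 8))
    v = rng.normal(size=(6, 8))
    out, weights = rela(q, k, v, gain=np.ones(8))
    print(out.shape, float(np.mean(weights == 0.0)))  # output shape, share of zero weights
```

Unlike softmax weights, the ReLU weights are exactly zero wherever the score is negative and do not sum to one, which is why the output is normalised afterwards.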
Warning
This repository is under development, but please feel free to explore it and share any feedback or suggestions. 🚧
@article{Zhang2021SparseAwLU,
title = {Sparse Attention with Linear Units},
author = {Biao Zhang and Ivan Titov and Rico Sennrich},
journal = {ArXiv},
year = {2021},
volume = {abs/2104.07012}
}