Elastic Sparse Attention for Long-Sequence Modeling

This is the official implementation of Elastic Sparse Attention for Long-Sequence Modeling.

License Agreement

All our open-weight models are licensed under Apache 2.0.

Citation

If you find our work helpful, please cite it as follows.

@manual{wu_2025_15871900,
  title  = {Elastic Sparse Attention for Long-Sequence Modeling},
  author = {Wu, Hecong},
  month  = jul,
  year   = 2025,
  doi    = {10.5281/zenodo.15871900},
  url    = {https://doi.org/10.5281/zenodo.15871900},
}