
ring-attention experiments

Primary language: Python · License: Apache-2.0

ring-attention

Ring Attention leverages blockwise computation of self-attention across multiple GPUs, enabling training and inference on sequences that would be too long for a single device.
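To illustrate the idea, here is a minimal single-process sketch (not this repository's implementation): each simulated "device" owns one block of Q, while the K/V blocks circulate around a ring; partial results are combined with a numerically stable online softmax, so the full attention matrix is never materialized on any one device. The function name and shapes are illustrative only.

```python
import numpy as np

def ring_attention(q, k, v, num_devices):
    """Simulate Ring Attention on one host: each 'device' holds one
    block of Q; K/V blocks rotate around the ring one step at a time."""
    seq_len, dim = q.shape
    qs = np.split(q, num_devices)
    ks = np.split(k, num_devices)
    vs = np.split(v, num_devices)
    outputs = []
    for i in range(num_devices):
        qi = qs[i]
        # Running statistics for the online (streaming) softmax:
        m = np.full((qi.shape[0], 1), -np.inf)  # running row max
        l = np.zeros((qi.shape[0], 1))          # running normalizer
        acc = np.zeros_like(qi)                 # running weighted sum of V
        for step in range(num_devices):
            j = (i + step) % num_devices        # K/V block arriving this ring step
            s = qi @ ks[j].T / np.sqrt(dim)
            m_new = np.maximum(m, s.max(axis=-1, keepdims=True))
            p = np.exp(s - m_new)
            scale = np.exp(m - m_new)           # rescale old stats to new max
            l = l * scale + p.sum(axis=-1, keepdims=True)
            acc = acc * scale + p @ vs[j]
            m = m_new
        outputs.append(acc / l)
    return np.concatenate(outputs)
```

The result matches ordinary full attention; only the communication pattern and memory footprint differ. In a real multi-GPU setting the inner loop's block fetch becomes a point-to-point send/receive overlapped with compute.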

This repository contains notebooks, experiments and a collection of links to papers and other material related to Ring Attention.

Research / Material

Notebooks

Development References

How to contribute

Contact us on the GPU MODE Discord server: https://discord.gg/gpumode. PRs are welcome (please create an issue first).