FlashAttention for Dragon

This repository extends FlashAttention and other Transformer operators for Dragon.

Following the design principle of Dragon, this repository is devoted to unifying the modeling of Transformers across NVIDIA GPUs, Apple Silicon processors, Cambricon MLUs, and other AI accelerators.

Installation

Installing Package

Clone this repository to a local disk, then build and install it:

# Build and install the native extensions.
cd flash-attention && mkdir build
cd build && cmake .. && make install -j $(nproc)
# Install the Python package from the repository root.
pip install ..
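
To confirm the installation, a minimal import check can be run. The module name flash_attn below mirrors upstream FlashAttention and is an assumption; this Dragon port may expose a different package name, so consult the repository sources.

# check_install.py -- minimal post-install sanity check.
# NOTE: the module name "flash_attn" follows upstream FlashAttention and is an
# assumption here; the actual package name in this port may differ.
import flash_attn
print("FlashAttention installed at:", flash_attn.__file__)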

License

BSD 2-Clause license

Acknowledgement

We thank the following repository: FlashAttention.