/Griffin

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Primary Language: Python

Issues