/Griffin

Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"

Primary LanguagePythonMIT LicenseMIT