An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
Primary language: Python. License: MIT.