Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
Primary language: Python. License: MIT.