# awesome-ssm

A curated list of state space models (SSMs) and related works.

License: MIT


## List for SSMs

| Number | SSM | Paper | Code | Conference or Journal | URL |
| --- | --- | --- | --- | --- | --- |
| 1 | HiPPO | HiPPO: Recurrent Memory with Optimal Polynomial Projections | https://github.com/state-spaces/s4 | NeurIPS 2020 | https://proceedings.neurips.cc/paper/2020/hash/102f0bb6efb3a6128a3c750dd16729be-Abstract.html |
| 2 | LSSL | Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers | https://github.com/state-spaces/s4 | NeurIPS 2021 | https://openreview.net/forum?id=yWd42CWN3c |
| 3 | S4 | Efficiently Modeling Long Sequences with Structured State Spaces | https://github.com/state-spaces/s4 | ICLR 2022 | https://openreview.net/forum?id=uYLFoz1vlAC |
| 4 | DSS | Diagonal State Spaces are as Effective as Structured State Spaces | https://github.com/ag1988/dss | NeurIPS 2022 | https://openreview.net/forum?id=RjS0j6tsSrf |
| 5 | S4D | On the Parameterization and Initialization of Diagonal State Space Models | https://github.com/state-spaces/s4 | NeurIPS 2022 | https://openreview.net/forum?id=yJE7iQSAep |
| 6 | Generalized HiPPO | How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections | https://github.com/state-spaces/s4 | ICLR 2023 | https://openreview.net/forum?id=klK17OQ3KB |
| 7 | GSS | Long Range Language Modeling via Gated State Spaces | | ICLR 2023 | https://openreview.net/forum?id=5MkYIYCbva |
| 8 | Liquid S4 | Liquid Structural State-Space Models | https://github.com/raminmh/liquid-s4 | ICLR 2023 | https://openreview.net/forum?id=g4OTKRKfS7R |
| 9 | S5 | Simplified State Space Layers for Sequence Modeling | https://github.com/lindermanlab/S5 | ICLR 2023 | https://openreview.net/forum?id=Ai8Hw3AXqks |
| 10 | H3 | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | https://github.com/HazyResearch/H3 | ICLR 2023 | https://openreview.net/forum?id=COZDy0WYGg |
| 11 | S4-PTD and S5-PTD | Robustifying State-space Models for Long Sequences via Approximate Diagonalization | | ICLR 2024 | https://openreview.net/forum?id=DjeQ39QoLQ |
| 12 | S6 | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | https://github.com/state-spaces/mamba | arXiv | https://arxiv.org/abs/2312.00752 |
| 13 | STU | Spectral State Space Models | https://github.com/catid/spectral_ssm | arXiv | https://arxiv.org/abs/2312.06837 |
| 14 | Mamba 2 | Transformers are SSMs: Generalized Models and Efficient Algorithms with Structured State Space Duality | https://github.com/state-spaces/mamba | ICML 2024 | https://arxiv.org/abs/2405.21060 |
| 15 | RTF | State-Free Inference of State-Space Models: The Transfer Function Approach | https://github.com/ruke1ire/RTF | ICML 2024 | https://openreview.net/forum?id=DwwI9L67B5 |
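All of the models above build on the same backbone: a discrete linear state space recurrence. As a rough orientation only (a minimal sketch with toy parameters invented here, not code from any listed repository):

```python
import numpy as np

# Minimal sketch of a discrete linear SSM (toy parameters, for illustration):
#   x[k] = A x[k-1] + B u[k]   (state update)
#   y[k] = C x[k]              (readout)
# The papers above differ mainly in how A, B, C are structured and trained.

def ssm_scan(A, B, C, u):
    """Apply a linear SSM to a scalar input sequence u of shape (L,)."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k   # evolve the hidden state
        ys.append(C @ x)      # project the state to a scalar output
    return np.array(ys)

# Toy example: 2-dimensional state, scalar input/output.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([1.0, 0.5])
C = np.array([1.0, -1.0])
u = np.ones(4)
y = ssm_scan(A, B, C, u)
```

Because the recurrence is linear and time-invariant, the same computation can also be expressed as a convolution with a precomputed kernel, which is the trick S4-family models exploit for parallel training.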

## List for Linear RNNs (LRNNs)

| Number | LRNN | Paper | Code | Conference or Journal | URL |
| --- | --- | --- | --- | --- | --- |
| 1 | CKConv | CKConv: Continuous Kernel Convolution For Sequential Data | https://github.com/dwromero/ckconv | ICLR 2021 | https://openreview.net/forum?id=8FhxBtXSl0 |
| 2 | FlexConv | FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes | https://github.com/rjbruin/flexconv | ICLR 2022 | https://openreview.net/forum?id=3jooF27-0Wy |
| 3 | DLR | Simplifying and Understanding State Space Models with Diagonal Linear RNNs | https://github.com/ag1988/dlr | arXiv | https://arxiv.org/abs/2212.00768 |
| 4 | CCNN | Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN | https://github.com/david-knigge/ccnn | ICLR 2023 | https://openreview.net/forum?id=ZW5aK4yCRqU |
| 5 | SGConv | What Makes Convolutional Models Great on Long Sequence Modeling? | https://github.com/ctlllll/SGConv | ICLR 2023 | https://openreview.net/forum?id=TGJSPbRpJX- |
| 6 | Mega | Mega: Moving Average Equipped Gated Attention | https://github.com/facebookresearch/mega | ICLR 2023 | https://openreview.net/forum?id=qNLe3iq2El |
| 7 | TNN | Toeplitz Neural Network for Sequence Modeling | https://github.com/Doraemonzzz/tnn-pytorch | ICLR 2023 | https://openreview.net/forum?id=IxmWsm4xrua |
| 8 | Hyena | Hyena Hierarchy: Towards Larger Convolutional Language Models | https://github.com/hazyresearch/safari | ICML 2023 | https://proceedings.mlr.press/v202/poli23a.html |
| 9 | MultiresNet | Sequence Modeling with Multiresolution Convolutional Memory | https://github.com/thjashin/multires-conv | ICML 2023 | https://proceedings.mlr.press/v202/shi23f.html |
| 10 | LRU | Resurrecting Recurrent Neural Networks for Long Sequences | | ICML 2023 | https://proceedings.mlr.press/v202/orvieto23a.html |
| 11 | RWKV v4 (Dove) | RWKV: Reinventing RNNs for the Transformer Era | https://github.com/BlinkDL/RWKV-LM | EMNLP 2023 | https://aclanthology.org/2023.findings-emnlp.936/ |
| 12 | RetNet | Retentive Network: A Successor to Transformer for Large Language Models | https://github.com/microsoft/torchscale | arXiv | https://arxiv.org/abs/2307.08621 |
| 13 | MultiHyena | Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions | | NeurIPS 2023 | https://openreview.net/forum?id=OWELckerm6 |
| 14 | Monarch Mixer | Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture | https://github.com/HazyResearch/m2 | NeurIPS 2023 | https://openreview.net/forum?id=cB0BImqSS9 |
| 15 | SeqBoat | Sparse Modular Activation for Efficient Sequence Modeling | https://github.com/renll/SeqBoat | NeurIPS 2023 | https://openreview.net/forum?id=TfbzX6I14i |
| 16 | HGRN | Hierarchically Gated Recurrent Neural Network for Sequence Modeling | https://github.com/OpenNLPLab/HGRN | NeurIPS 2023 | https://openreview.net/forum?id=P1TCHxJwLB |
| 17 | GLA Transformer | Gated Linear Attention Transformers with Hardware-Efficient Training | https://github.com/sustcsonglin/flash-linear-attention | arXiv | https://arxiv.org/abs/2312.06635 |
| 18 | Orchid | Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling | | arXiv | https://arxiv.org/abs/2402.18508 |
| 19 | RWKV v5 (Eagle) and v6 (Finch) | Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence | https://huggingface.co/RWKV | arXiv | https://arxiv.org/abs/2404.05892 |
| 20 | HGRN2 | HGRN2: Gated Linear RNNs with State Expansion | https://github.com/OpenNLPLab/HGRN2 | arXiv | https://arxiv.org/abs/2404.07904 |
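Many of the recurrent models above (e.g. LRU, HGRN, GLA) share an elementwise-gated linear recurrence as their core primitive. A minimal sketch with made-up gate values, not any listed repository's API:

```python
import numpy as np

# Minimal sketch of the gated linear recurrence common to many LRNNs:
#   h[t] = a[t] * h[t-1] + b[t]
# where a[t] is a (data-dependent) decay gate and b[t] a gated input,
# all elementwise. Shown here as a sequential loop; the same recurrence
# is associative, so it also admits a logarithmic-depth parallel scan.

def linear_rnn(a, b):
    """a, b: arrays of shape (L, D). Returns all hidden states, shape (L, D)."""
    h = np.zeros_like(b[0])
    out = np.empty_like(b)
    for t in range(len(b)):
        h = a[t] * h + b[t]  # elementwise decay plus gated input
        out[t] = h
    return out

# Toy example: length-3 sequence, 2 channels, constant gates.
a = np.full((3, 2), 0.5)  # decay gates in (0, 1)
b = np.ones((3, 2))       # gated inputs
h = linear_rnn(a, b)
```

The absence of a nonlinearity between steps is what distinguishes these models from classic RNNs and enables their parallel training.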

## List for Surveys

| Number | Paper | Journal or Conference | URL |
| --- | --- | --- | --- |
| 1 | A Unified View of Long-Sequence Models towards Modeling Million-Scale Dependencies | arXiv | https://arxiv.org/abs/2302.06218 |
| 2 | State Space Model for New-Generation Network Alternative to Transformers: A Survey | arXiv | https://arxiv.org/abs/2404.09516 |