/mamba-recall



Experiments with Mamba State Space Models

Experiments with the Mamba model from Gu and Dao, using their implementation.

The first experiment trains on the synthetic induction heads task and visualizes the $\Delta(u_l)$ values in both layers.
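For readers unfamiliar with the task, here is a minimal sketch of how one induction heads example might be generated. It is not taken from this repository: the function name `make_induction_example`, the default sizes, and the choice to encode the marker as token id `vocab_size` are all assumptions made for illustration. The idea is that a special marker token appears once in a random sequence, followed by some ordinary token; the marker then appears again at the end, and the model must recall the token that followed it the first time.

```python
import random


def make_induction_example(seq_len=16, vocab_size=16, rng=None):
    """Build one (tokens, target) pair for an induction-heads style task.

    Hypothetical helper: ordinary tokens are 0..vocab_size-1 and the
    special marker is encoded as token id `vocab_size` (an assumption).
    """
    rng = rng or random.Random()
    special = vocab_size

    # Start from a sequence of uniformly random ordinary tokens.
    tokens = [rng.randrange(vocab_size) for _ in range(seq_len)]

    # Put the first marker early enough that the token after it is not
    # the final position, which is reserved for the second marker.
    pos = rng.randrange(0, seq_len - 2)
    tokens[pos] = special

    # The token right after the first marker is what must be recalled.
    target = tokens[pos + 1]

    # The sequence ends with the marker; the model predicts `target` next.
    tokens[-1] = special
    return tokens, target


if __name__ == "__main__":
    tokens, target = make_induction_example(rng=random.Random(0))
    print(tokens, "->", target)
```

Passing a seeded `random.Random` makes the examples reproducible, which helps when comparing $\Delta(u_l)$ plots across runs on the same inputs.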