Mamba2
Joelx opened this issue · 2 comments
Joelx commented
First, thanks for this great contribution and the well-written paper!
Since Mamba2 was also released very recently, do you have any thoughts on integrating Mamba2 into the Samba architecture, or on how it might affect it?
Would be much appreciated.
richardburleigh commented
I tested this a while ago and it was basically swappable. From memory, I just replaced:
from .mamba_simple import Mamba
here with:
from mamba_ssm import Mamba2 as Mamba
I'm currently training a 270M Samba model, and will try again with Mamba2 to compare the results.
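For anyone who wants to switch between the two without editing the import by hand, here is a rough sketch of the swap described above. The helper name `load_mixer` and the candidate list are hypothetical and not part of the Samba repo or of `mamba_ssm`. The only thing taken from this thread is that `Mamba2` can be imported from `mamba_ssm`. The sketch also assumes an installed `mamba_ssm` that exposes `Mamba` at the top level.

```python
import importlib

# Hypothetical preference order: try Mamba2 first, then the original Mamba.
DEFAULT_CANDIDATES = (
    ("mamba_ssm", "Mamba2"),
    ("mamba_ssm", "Mamba"),
)


def load_mixer(candidates=DEFAULT_CANDIDATES):
    """Return (qualified_name, cls) for the first importable candidate.

    Returns (None, None) if no candidate can be imported. Only ImportError
    is caught, so other failures during import still surface.
    """
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # package not installed; try the next candidate
        cls = getattr(module, attr, None)
        if cls is not None:
            return f"{module_name}.{attr}", cls
    return None, None
```

Calling `name, Mamba = load_mixer()` and using `Mamba` wherever the model code previously used the imported class should then behave like the one-line import change. The two classes may not accept identical constructor arguments, so it is worth checking the layer construction against the `mamba_ssm` source before training.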