Anybody observe slow training speed for Mamba compared to Transformer model?
vgthengane opened this issue · 2 comments
vgthengane commented
Has anybody observed slow training speed for Mamba compared to a Transformer model?
d62lu commented
Yes, I just made the comparison. The Mamba block is indeed slower than a Transformer under the same input dimension.
d62lu commented
This holds for both model training and inference. I haven't looked into the details of the Mamba block yet; maybe I missed something in the code....
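For reference, here is a minimal timing sketch for this kind of comparison (assuming the official mamba-ssm package and a CUDA GPU; the dimensions and step counts below are illustrative, not the exact settings used above). It times one Mamba block against one PyTorch Transformer encoder layer on identical input, including the backward pass to approximate a training step:

```python
import time

import torch
import torch.nn as nn
from mamba_ssm import Mamba  # official package; requires a CUDA GPU

# Illustrative sizes, not the ones used in the comparison above.
batch, seqlen, d_model = 8, 1024, 512
x = torch.randn(batch, seqlen, d_model, device="cuda")

mamba = Mamba(d_model=d_model, d_state=16, d_conv=4, expand=2).cuda()
attn = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True).cuda()

def bench(module, x, steps=50):
    # Warm up so kernel compilation/caching does not skew the timing.
    for _ in range(5):
        module(x).sum().backward()
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(steps):
        module(x).sum().backward()  # forward + backward, as in training
    torch.cuda.synchronize()
    return (time.perf_counter() - t0) / steps

print(f"Mamba block:       {bench(mamba, x) * 1e3:.2f} ms/step")
print(f"Transformer layer: {bench(attn, x) * 1e3:.2f} ms/step")
```

Note that the outcome depends heavily on sequence length: attention scales quadratically in seqlen while Mamba's selective scan is roughly linear, so at short sequences a highly optimized fused attention kernel can well come out faster.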