redotvideo/mamba-chat

Memory requirements for training


pkpro commented

I was able to run the 2.8B model for inference, and it uses about 6 GB of VRAM. Your README lists a 24 GB requirement for training. Does the model use that much more memory during training (because of 32-bit precision?), or is the extra memory needed for the input batches?
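For context, here is the rough back-of-the-envelope arithmetic behind the question. It is only a sketch: the byte counts per parameter are generic assumptions (16-bit weights for inference; 32-bit weights, gradients, and two Adam moment buffers for training), not anything confirmed by this repo, and `estimate_gb` is a hypothetical helper. Activations and input batches are not counted at all.

```python
def estimate_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory for per-parameter state only, in decimal GB (activations excluded)."""
    return n_params * bytes_per_param / 1e9


N_PARAMS = 2.8e9  # nominal parameter count of the 2.8B model

# Assumption: inference keeps only 16-bit weights (2 bytes per parameter).
inference_gb = estimate_gb(N_PARAMS, 2)

# Assumption: plain fp32 training with Adam keeps, per parameter,
# weights (4 bytes) + gradients (4 bytes) + two moment buffers (8 bytes).
training_gb = estimate_gb(N_PARAMS, 4 + 4 + 8)

print(f"inference, 16-bit weights: {inference_gb:.1f} GB")
print(f"training, fp32 + Adam:     {training_gb:.1f} GB")
```

The first figure is close to the roughly 6 GB I see at inference. The second ignores activations entirely, so it does not by itself explain the 24 GB in the README; the actual number would depend on the precision and optimizer the training script uses.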