redotvideo/mamba-chat
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
Python · Apache-2.0
Issues
Fail to install requirements.txt on Mac
#37 opened by taozhiyuai · 4 comments
Error importing MambaLMHeadModel from mamba_ssm.models.mixer_seq_simple in Google Colab
#34 opened by ankitsrivastava637 · 5 comments
Issue while installing requirements.txt
#3 opened by wereretot · 1 comment
'MambaConfig' object has no attribute 'to_dict'
#28 opened by sooko · 0 comments
Finetuning error
#36 opened by khfs · 0 comments
How to use the model after training it?
#26 opened by kishore-FDI · 0 comments
Why choose Zephyr as the tokenizer?
#32 opened by rangehow · 0 comments
Loading mamba-790m downloaded from Hugging Face fails with 'Missing key(s) in state_dict: "backbone.layers.0.mixer.A_b_log"'
#30 opened by zxsdd9 · 0 comments
sentencepiece version
#31 opened by nuochenpku · 0 comments
Interesting chat example
#23 opened by protima-banerjee · 0 comments
MoE (https://arxiv.org/abs/2401.04081)
#22 opened by Eupham · 1 comment
ImportError: /usr/local/lib/python3.10/dist-packages/causal_conv1d_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
#21 opened by venkat-p-r · 3 comments
Can't use trained model
#16 opened by yy9996 · 6 comments
Error during training
#15 opened by Eupham · 0 comments
Any plan or interest in using the OpenChat algorithm (https://github.com/imoneoi/openchat) to train your chat model?
#19 opened by houghtonweihu · 0 comments
Eval on benchmarks?
#18 opened by tic-top · 5 comments
How can I run this on Windows 10?
#10 opened by KevinRyu · 1 comment
Error when inferencing
#17 opened by SeifMosaad · 4 comments
Finetuning on a 3090, but loss is equal to zero
#11 opened by Yingyue-L · 0 comments
Memory requirements for training
#14 opened by pkpro · 7 comments
Demo
#4 opened by fakerybakery · 1 comment
TypeError: MixerModel.__init__() got an unexpected keyword argument 'bos_token_id'
#8 opened by xiechengmude · 1 comment
Setting device for training
#5 opened by ekg · 2 comments
Bug in train_mamba.py line 53
#2 opened by vmajor