redotvideo/mamba-chat
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
Python · Apache-2.0
Issues
Fail to install requirements.txt on Mac
#37 opened by taozhiyuai · 4 comments
Error importing MambaLMHeadModel from mamba_ssm.models.mixer_seq_simple in Google Colab
#34 opened by ankitsrivastava637 · 5 comments
Issue while installing requirements.txt
#3 opened by wereretot · 1 comment
'MambaConfig' object has no attribute 'to_dict'
#28 opened by sooko · 0 comments
Finetuning error
#36 opened by khfs · 0 comments
How to use the model after training it?
#26 opened by kishore-FDI · 0 comments
Why choose Zephyr as the tokenizer?
#32 opened by rangehow · 0 comments
Loading mamba-790m downloaded from Hugging Face fails with 'Missing key(s) in state_dict: "backbone.layers.0.mixer.A_b_log"'
#30 opened by zxsdd9 · 0 comments
sentencepiece version
#31 opened by nuochenpku · 0 comments
Interesting chat example
#23 opened by protima-banerjee · 0 comments
MoE (https://arxiv.org/abs/2401.04081)
#22 opened by Eupham · 1 comment
ImportError: /usr/local/lib/python3.10/dist-packages/causal_conv1d_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
#21 opened by venkat-p-r · 3 comments
Can't use trained model
#16 opened by yy9996 · 6 comments
Error during training
#15 opened by Eupham · 0 comments
Any plan or interest in using the OpenChat algorithm (https://github.com/imoneoi/openchat) to train your chat model?
#19 opened by houghtonweihu · 0 comments
Eval on benchmarks?
#18 opened by tic-top · 5 comments
How can I run this on Windows 10?
#10 opened by KevinRyu · 1 comment
Error when inferencing
#17 opened by SeifMosaad · 4 comments
Finetuning on a 3090, but loss is equal to zero
#11 opened by Yingyue-L · 0 comments
Memory requirements for training
#14 opened by pkpro · 7 comments
Demo
#4 opened by fakerybakery · 1 comment
TypeError: MixerModel.__init__() got an unexpected keyword argument 'bos_token_id'
#8 opened by xiechengmude · 1 comment
Setting device for training
#5 opened by ekg · 2 comments
Bug in train_mamba.py line 53
#2 opened by vmajor