Issues
- How to prepare the training data (#42, opened by ycsun1972, 2 comments)
- Update Anthropic Client (#34, opened by krrishdholakia, 1 comment)
- Inference is very slow on long text input (#39, opened by Colafei0406, 1 comment)
- How was the 18k dataset prepared? (#5, opened by musabgultekin, 2 comments)
- dummy conversations seem to be short (#40, opened by Arist12, 4 comments)
- OOM issue (#28, opened by WeixuanXiong, 1 comment)
- flash attention rename (#33, opened by Arist12, 0 comments)
- Do you support Llama-2-13b model data? (#32, opened by brewswang, 0 comments)
- train ValueError (#31, opened by brewswang, 4 comments)
- Output token limit (#29, opened by MoppyDu97, 3 comments)
- Maybe a bug in the preprocess? (#26, opened by Richar-Du, 2 comments)
- About the print message (#25, opened by lucasjinreal, 1 comment)
- About the learning rate (#19, opened by lucasjinreal, 1 comment)
- Xformers Monkey Patch Compatibility (#21, opened by fahadh4ilyas, 1 comment)
- Longchat inference configuration (#23, opened by SeekWrldTea, 9 comments)
- longchat-13b-16k chat not work (#14, opened by ahkimkoo, 8 comments)
- OutOfMemoryError: CUDA out of memory. (#9, opened by brewswang, 0 comments)
- Web GUI for longchat (#12, opened by VVNMA, 2 comments)
- The purpose of pretrain script? (#17, opened by fahadh4ilyas, 9 comments)
- Monkey Patch Xformers use `past_key_value` but `use_cache` can't be `True`? (#15, opened by fahadh4ilyas, 0 comments)
- Support for other model like Baichuan (#20, opened by lucasjinreal, 8 comments)
- why not reuse fschat code? (#16, opened by lucasjinreal, 1 comment)
- Will it support qlora? (#18, opened by lw3259111, 1 comment)
- Add scripts to generate more testcases (#6, opened by DachengLi1, 7 comments)
- How to use 3090 to train 16k model? (#4, opened by aresa7796, 1 comment)
- Multi-node training? (#11, opened by XueFuzhao, 4 comments)
- Load the model for inference? (#10, opened by fahadh4ilyas, 5 comments)
- unsupervised pre-training on the model (#2, opened by wqn1)