Blealtan/RWKV-LM-LoRA
RWKV is an RNN with transformer-level LLM performance that can be trained directly like a GPT (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
Python · Apache-2.0
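The RNN-style inference the description refers to comes from RWKV's time-mixing ("WKV") recurrence, which replaces attention with an exponentially decaying weighted average that can be updated token by token. Below is a simplified single-channel sketch in plain Python; the names `w`, `u`, `ks`, `vs` are illustrative, and the real model applies this per channel with learned decay vectors:

```python
import math

def wkv_recurrent(w, u, ks, vs):
    """Single-channel WKV recurrence: constant-size state (num, den) and
    linear time in sequence length -- why RWKV can infer like an RNN.

    w: per-channel decay (>= 0), u: "bonus" weight for the current token,
    ks/vs: key and value scalars for each token.
    """
    num, den = 0.0, 0.0              # running weighted sums over past tokens
    out = []
    for k, v in zip(ks, vs):
        e = math.exp(u + k)          # extra weight on the current token
        out.append((num + e * v) / (den + e))
        decay = math.exp(-w)         # older tokens fade exponentially
        num = decay * num + math.exp(k) * v
        den = decay * den + math.exp(k)
    return out

def wkv_direct(w, u, ks, vs, t):
    """Same output computed the "transformer way": an explicit sum over
    all past tokens for position t -- O(t) work per token instead of O(1)."""
    num = sum(math.exp(-(t - 1 - i) * w + ks[i]) * vs[i] for i in range(t))
    den = sum(math.exp(-(t - 1 - i) * w + ks[i]) for i in range(t))
    e = math.exp(u + ks[t])
    return (num + e * vs[t]) / (den + e)
```

Both forms agree numerically, but the recurrent one carries only two scalars of state per channel, which is what keeps inference VRAM flat and makes ctx_len effectively unbounded at generation time.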
Issues
How to fine-tune to add domain-specific knowledge
#20 opened by magicuter - 1
ninja: no work to do
#55 opened by ByUnal - 1
[Feature Request] RWKV-v5 support
#53 opened by l1006986533 - 0
Error: AttributeError: 'MMapIndexedDataset' object has no attribute '_bin_buffer_mmap'
#52 opened by Macaron-Lawrence - 0
Low precision LoRA fine tuning
#5 opened by TehVenomm - 0
Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [92,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
#50 opened by cdg1921 - 0
PSA: Issue with Multi-GPU & CUDA 12.0
#49 opened by PicoCreator - 1
Stuck in Multigpus lora finetuning
#48 opened by PeiyuZ-star - 0
ValueError: offset must be non-negative and no greater than buffer length (76025948)
#46 opened by wuzeyuuu - 1
New model World run error, is it support new vocab 'rwkv_vocab_v20230424.txt'?
#43 opened by Helloworld2345567 - 2
Cannot allocate memory
#36 opened by chenhaobupt - 0
Train on CPU
#28 opened by sam-leonid - 1
size mismatch for emb.weight: copying a param with shape torch.Size([50277, 4096]) from checkpoint, the shape in current model is torch.Size([886, 4096])
#27 opened by aizpy - 3
KeyError: 'RWKV_JIT_ON'
#21 opened by bello7777 - 0
Resolved: core dump during multi-GPU training
#18 opened by magicuter - 0
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
#22 opened by dandeperson - 0
finetune for machine translation
#19 opened by muhammed-saeed - 7
txt dataset format
#17 opened by Leoeeeeeeea - 1
Resolved: RuntimeError: "addmm_impl_cpu_" not implemented for 'Half' when running chat.py on CPU with fp32
#16 opened by ChinesePainting - 0
[Feature request] Int8 mode to run the original model
#15 opened by LiuLinyun - 1
When using LoRA to run RWKV fine-tuning in a Docker environment with the enwik8 dataset (about 100 MB), the following error is reported
#12 opened by weikeltf - 1
size mismatch for emb.weight: copying a param with shape torch.Size([50277, 2048]) from checkpoint, the shape in current model is torch.Size([85, 2048]).
#10 opened by miandui-WuBo - 2
GPU requirements?
#6 opened by Itto1992 - 4
v4neo fine-tuning: train.py seems outdated; some arguments don't work, e.g. 'Namespace' object has no attribute 'strategy'
#2 opened by UnstoppableCurry - 2
Correct way to run?
#1 opened by fullstackwebdev