Blealtan/RWKV-LM-LoRA
RWKV is an RNN with transformer-level LLM performance that can be trained directly like a GPT (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
Python · Apache-2.0
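The RNN-style inference the description refers to comes from RWKV's time-mixing ("WKV") recurrence, which replaces attention with an exponentially decaying weighted average that can be updated token by token. Below is a simplified single-channel sketch in plain Python; the names `w`, `u`, `ks`, `vs` are illustrative, and the real model applies this per channel with learned decay vectors:

```python
import math

def wkv_recurrent(w, u, ks, vs):
    """Single-channel WKV recurrence: constant-size state (num, den) and
    linear time in sequence length -- why RWKV can infer like an RNN.

    w: per-channel decay (>= 0), u: "bonus" weight for the current token,
    ks/vs: key and value scalars for each token.
    """
    num, den = 0.0, 0.0              # running weighted sums over past tokens
    out = []
    for k, v in zip(ks, vs):
        e = math.exp(u + k)          # extra weight on the current token
        out.append((num + e * v) / (den + e))
        decay = math.exp(-w)         # older tokens fade exponentially
        num = decay * num + math.exp(k) * v
        den = decay * den + math.exp(k)
    return out

def wkv_direct(w, u, ks, vs, t):
    """Same output computed the "transformer way": an explicit sum over
    all past tokens for position t -- O(t) work per token instead of O(1)."""
    num = sum(math.exp(-(t - 1 - i) * w + ks[i]) * vs[i] for i in range(t))
    den = sum(math.exp(-(t - 1 - i) * w + ks[i]) for i in range(t))
    e = math.exp(u + ks[t])
    return (num + e * vs[t]) / (den + e)
```

Both forms agree numerically, but the recurrent one carries only two scalars of state per channel, which is what keeps inference VRAM flat and makes ctx_len effectively unbounded at generation time.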
Issues
How to fine-tune to add domain-specific knowledge
#20 opened by magicuter - 1
ninja: no work to do
#55 opened by ByUnal - 1
[Feature Request] RWKV-v5 support
#53 opened by l1006986533 - 0
Error: AttributeError: 'MMapIndexedDataset' object has no attribute '_bin_buffer_mmap'
#52 opened by Macaron-Lawrence - 0
Low precision LoRA fine tuning
#5 opened by TehVenomm - 0
Indexing.cu:1141: indexSelectLargeIndex: block: [202,0,0], thread: [92,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
#50 opened by cdg1921 - 0
PSA: Issue with Multi-GPU & CUDA 12.0
#49 opened by PicoCreator - 1
Stuck in Multigpus lora finetuning
#48 opened by PeiyuZ-star - 0
ValueError: offset must be non-negative and no greater than buffer length (76025948)
#46 opened by wuzeyuuu - 1
New model World run error, is it support new vocab 'rwkv_vocab_v20230424.txt'?
#43 opened by Helloworld2345567 - 2
Cannot allocate memory
#36 opened by chenhaobupt - 0
Train on CPU
#28 opened by sam-leonid - 1
size mismatch for emb.weight: copying a param with shape torch.Size([50277, 4096]) from checkpoint, the shape in current model is torch.Size([886, 4096])
#27 opened by aizpy - 3
KeyError: 'RWKV_JIT_ON'
#21 opened by bello7777 - 0
Resolved: core dump during multi-GPU training
#18 opened by magicuter - 0
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
#22 opened by dandeperson - 0
finetune for machine translation
#19 opened by muhammed-saeed - 7
txt dataset format
#17 opened by Leoeeeeeeea - 1
Resolved: RuntimeError: "addmm_impl_cpu_" not implemented for 'Half' when running chat.py on CPU with fp32
#16 opened by ChinesePainting - 0
[Feature request] Int8 mode to run the original model
#15 opened by LiuLinyun - 1
When using LoRA to run RWKV fine-tuning in a Docker environment with the enwik8 dataset (about 100 MB), the following error is reported
#12 opened by weikeltf - 1
size mismatch for emb.weight: copying a param with shape torch.Size([50277, 2048]) from checkpoint, the shape in current model is torch.Size([85, 2048]).
#10 opened by miandui-WuBo - 2
GPU requirements?
#6 opened by Itto1992 - 4
v4neo fine-tuning: train.py seems outdated; some arguments don't work, e.g. 'Namespace' object has no attribute 'strategy'
#2 opened by UnstoppableCurry - 2
Correct way to run?
#1 opened by fullstackwebdev