BlinkDL/RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.

Python, Apache-2.0

Issues

Why is the wkv operation designed this way?
#269 opened 24 days ago by NanakiC
1
FP32/FP16 precision training
#271 opened 13 days ago by KompressorSC
1
A question about MQAR evaluation
#272 opened 6 days ago by necrophagists
1
Does RWKV support caching a shared prefix during prefill?
#270 opened 21 days ago by Lier007
2
RWKV-v4 training doesn't stop after the defined max_epochs
#266 opened a month ago by shamilajeewantha
2
Where's the cuda backward function for v7?
#263 opened 2 months ago by bmilde
1
How to call the fine-tuned model like using an API?
#267 opened a month ago by jieli9626
1
With RWKV-v4, if I wish to build an encoder-decoder model, for example for translation, which hidden states need to be passed between the encoder and the decoder? Can you provide some guidelines on this, or point to any existing work?
#268 opened a month ago by shamilajeewantha
0
How do I configure dialogue data so that the other speaker's turns are not trained on?
#264 opened a month ago by petergaoshan
3
The RUN_CUDA_RWKV6 part would be better implemented in PyTorch; otherwise it is hard to port
#252 opened 3 months ago by bobo-wmdigit
5
RWKV .pth to .onnx
#260 opened 2 months ago by momocoQAQ
2
Error when using init_params from rwkv_v6_demo
#262 opened 2 months ago by KompressorSC
2
Is there a problem with the RWKV implementation in the Hugging Face transformers library? I keep running into errors during backward.
#221 opened 10 months ago by Youngluc
3
How well does RWKV perform on RAG tasks?
#261 opened 2 months ago by ZTurboX
1
Is RWKV 5 supported by vLLM? LMdeploy? TGI? Fastllm? FasterTransformer?
#232 opened 8 months ago by lanzhoushaobing
2
A formula in the paper is written incorrectly
#259 opened 2 months ago by KompressorSC
1
XXX is currently not supported in Torchscript: I don't know how to solve this problem; there is something wrong with CUDA on my device
#258 opened 2 months ago by LeC-Z
1
Probable mistake in Eq. 19 in the arxiv paper "Eagle and Finch"
#254 opened 3 months ago by oliverYoung2001
3
Please add rocm support
#247 opened 5 months ago by Wintoplay
1
Does RWKV only show lower GPU memory usage during inference?
#250 opened 4 months ago by thucz
3
Error when running rwkv_v6_demo.py
#256 opened 2 months ago by supercyt
2
Which 100+ languages are supported, and has anyone evaluated for which languages the translation quality is actually usable?
#255 opened 3 months ago by i18nsite
3
Please tell me how to solve this error reported while using rwkv: "CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`"
#253 opened 3 months ago by songjie1121
2
lightning_fabric.utilities.exceptions.MisconfigurationException: Unknown configuration for model optimizers.
#251 opened 3 months ago by blueridanus
2
How do I use state tuning with rwkv6-7B?
#246 opened 6 months ago by xinyinan9527
4
model.w["emb.weight"] is on the CPU
#249 opened 4 months ago by MarshtompCS
1
Replacing the RNN in a model with RWKV
#242 opened 6 months ago by hulucky1102
1
Zero-division error when args.n_layer = 1, caused by ratio_0_to_1. Can I set ratio_0_to_1 = 0 when n_layer = 1?
#243 opened 6 months ago by zdxdsw
1
How to understand the u vector in the original paper?
#244 opened 6 months ago by 141forever
1
RuntimeError: invalid unordered_map<K, T> key
#248 opened 5 months ago by Lixuanhe
1
Probable mistake in Eq. 16 in the preprint
#238 opened 7 months ago by zeyun-zhong
2
bug in new wkv6state_cuda
#241 opened 7 months ago by SmerkyG
0
Flash Attention
#239 opened 7 months ago by fakerybakery
2
The /v1/embeddings interface of rwkv is inconsistent with the /v1/embeddings interface of openai. How should they be compatible?
#240 opened 7 months ago by qq378488249
0
How does the generation speed of RWKV-5/6 compare to that of mamba with the same number of parameters?
#236 opened 7 months ago by h-zhao1997
1
Can RWKV beat Flash Attention?
#235 opened 7 months ago by yxchng
1
NCCL watchdog thread terminated with exception: CUDA error: an illegal memory access was encountered
#237 opened 7 months ago by ZetangForward
1
Tokenizer for fine tuning RWKV-v5 world model
#230 opened 9 months ago by mathewchris96
1
How to understand `no` variable in cuda code?
#234 opened 8 months ago by yxchng
1
How to train for long context
#233 opened 8 months ago by EasonXiao-888
1
Does RWKV-4 music use the RWKV-v4 network architecture?
#231 opened 8 months ago by zzczzc20
1
KeyError: "attribute 'weight' already exists"
#229 opened 9 months ago by ByUnal
1
Fine-tuning RWKV5-7B: Missing key(s) in state_dict:
#228 opened 9 months ago by liuao743
1
Can RWKV-v4 handle summarization tasks?
#227 opened 9 months ago by zzczzc20
1
Could you provide the fine-tuning parameters for all RWKV v5 models on Hugging Face?
#226 opened 9 months ago by lantudou
1
Finetuning RWKV-5-World-1B5-v2 model
#225 opened 9 months ago by ArchanaNarayanan843
1
Truncation in Tokenizer?
#224 opened 9 months ago by sedrick-keh-tri
1
RWKV for Text to Speech use case
#222 opened 9 months ago by rishikksh20
2
RWKV-5 World on colab
#223 opened 9 months ago by EnricoBeltramo
1
'types.SimpleNamespace' object has no attribute 'time_first'
#219 opened 10 months ago by legends-7
1