RWKV/RWKV-infctx-trainer

RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond!

Jupyter Notebook · Apache-2.0

Issues

Is the state gradient not implemented yet for the CUDA kernel? (hence bptt_truncated_learning still forced to be True?)
#102 opened 2 months ago by shouldsee
1
How do I build my own training datasets?
#98 opened 5 months ago by cgoxopx
0
Is the training of the RWKV-6 model not yet applicable?
#86 opened 8 months ago by wooks186
1
[Need help] Implement BPTT cuda code for V5
#64 opened 8 months ago by PicoCreator
2
[Feature] Support byte wise / character level encoding
#67 opened 8 months ago by PicoCreator
0
[Feature] Dataset streaming
#62 opened 8 months ago by PicoCreator
1
[Feature] LoRA support
#20 opened a year ago by PicoCreator
1
Optimizing model loading memory bump, for multi GPU training
#28 opened a year ago by PicoCreator
1
[Feature request] Please eliminate CUDA dependency
#36 opened a year ago by yynil
2
[Feature] Support for microbatches
#25 opened a year ago by PicoCreator
0
"\n" is escaped to "\\n" in preload_datapath.py
#34 opened a year ago by cahya-wirawan
3
Bug w.r.t. multicolumn masking
#35 opened a year ago by jprobichaud
4
[Feature] Add (kilo) tokens / second into the reporting
#26 opened a year ago by PicoCreator
1
Any pretrained models available?
#41 opened a year ago by BrightXiaoHan
0
[Feature] Run OpenAI compatible API server for eval harness
#30 opened a year ago by PicoCreator
1
Suggestion: option to only keep last.ckpt and auto delete older ones
#13 opened a year ago by h-a-s-k
4
[feature] positional loss bias support
#1 opened a year ago by PicoCreator
1
Suggestion: trainer should warn on low disk space
#27 opened a year ago by PicoCreator
0
Unable to finetune official RWKV-5 models
#16 opened a year ago by General-Redshift
2
[Feature] positive / negative data support (aka RLHF)
#21 opened a year ago by PicoCreator
0
Is it necessary for dataload to set num_workers to the full CPU core count?
#7 opened a year ago by diannaojiang
3
[feature] handle multiple datasets separately, and log loss on them separately
#15 opened a year ago by PicoCreator
0