RWKV/RWKV-infctx-trainer
RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond!
Jupyter Notebook · Apache-2.0
Issues
- 1
Is the state gradient not implemented yet for the CUDA kernel? (hence bptt_truncated_learning still forced to be True?)
#102 opened by shouldsee - 0
How do I build my own training datasets?
#98 opened by cgoxopx - 1
- 2
[Need help] Implement BPTT cuda code for V5
#64 opened by PicoCreator - 0
- 1
[Feature] Dataset streaming
#62 opened by PicoCreator - 1
[Feature] LoRA support
#20 opened by PicoCreator - 1
- 2
[Feature request] Please eliminate CUDA dependency
#36 opened by yynil - 0
[Feature] Support for microbatches
#25 opened by PicoCreator - 3
"\n" is escaped to "\\n" in preload_datapath.py
#34 opened by cahya-wirawan - 4
bug w.r.t. multicolumn masking
#35 opened by jprobichaud - 1
- 0
Any pretrained models available?
#41 opened by BrightXiaoHan - 1
- 1
[feature] positional loss bias support
#1 opened by PicoCreator - 0
- 3
Is it necessary for the dataloader to set num_workers to the full core count?
#7 opened by diannaojiang - 0
[feature] handle multiple datasets separately, and log loss on them separately
#15 opened by PicoCreator