BlinkDL/RWKV-LM

How to train for long context

EasonXiao-888 opened this issue · 1 comment

When I train RWKV-v4 with a 4096 context length, it throws an error here:
```python
if seq_len > rwkv_cuda_kernel.max_seq_length:
    raise ValueError(
        f"Cannot process a batch with {seq_len} tokens at the same time, use a maximum of "
        f"{rwkv_cuda_kernel.max_seq_length} with this model."
    )
```

Change T_MAX in model.py so it is at least your context length (4096 here). The WKV CUDA kernel is compiled with T_MAX as its maximum sequence length, and a larger T_MAX takes more VRAM; see the sketch below.
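
A minimal sketch of the relevant part of RWKV-v4's model.py, assuming the usual cuda/wkv_op.cpp and cuda/wkv_cuda.cu layout (exact flags and paths may differ between versions):

```python
# model.py (RWKV-v4) -- sketch only; exact layout varies by version
T_MAX = 4096  # must be >= ctx_len; larger values take more VRAM

from torch.utils.cpp_extension import load

# T_MAX is baked into the custom WKV kernel at compile time via -DTmax,
# which is why sequences longer than T_MAX are rejected at runtime.
wkv_cuda = load(
    name="wkv",
    sources=["cuda/wkv_op.cpp", "cuda/wkv_cuda.cu"],
    verbose=True,
    extra_cuda_cflags=[
        "-res-usage", "--maxrregcount 60", "--use_fast_math",
        "-O3", "-Xptxas -O3",
        f"-DTmax={T_MAX}",
    ],
)
```

After raising T_MAX, also set ctx_len to 4096 in the training script so the data loader and the kernel agree on the sequence length.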