Pre-train on a single machine with multiple GPUs
willard-yuan opened this issue · 3 comments
willard-yuan commented
Following train_redpajama.md, I tried to pretrain on a single machine with multiple GPUs. I ran the following command:
python pretrain/redpajama.py --devices 4 --train_data_dir data/lit-redpajama-sample
I got the following error:
File "/yuanshitestvepfs/lit-llama/pretrain/redpajama.py", line 144, in main
for iter_num, train_data in enumerate(train_dataloader):
RuntimeError: generator raised StopIteration
Is there anything I missed?
LamMoh1 commented
Try reducing the number of iterations.
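For context on the error message itself: since Python 3.7 (PEP 479), a StopIteration that escapes inside a generator is converted into "RuntimeError: generator raised StopIteration". That would be consistent with the data running out before the requested number of iterations, though the thread does not confirm this is the cause here. A minimal sketch of the mechanism, not taken from lit-llama; `batches` and `run` are hypothetical names used only for illustration:

```python
def batches(source, n_batches):
    """Hypothetical generator that pulls a fixed number of items from a source."""
    it = iter(source)
    for _ in range(n_batches):
        # next() raises StopIteration once `it` is exhausted. Inside a
        # generator, Python 3.7+ turns that into RuntimeError (PEP 479).
        yield next(it)


def run(n_items, n_batches):
    """Consume the generator and report either the items or the error text."""
    try:
        return list(batches(range(n_items), n_batches))
    except RuntimeError as err:
        return str(err)


# Enough data for the requested number of batches: works.
print(run(4, 2))
# Fewer items than requested batches: the generator fails partway through.
print(run(2, 4))
```

If the same thing is happening in the training loop, either fewer iterations or more data per device would avoid exhausting the iterator.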