Using more than 4 GPUs in linear evaluation
utjune opened this issue · 5 comments
Thanks for the good code implementation.
I'm using 8 GPUs on 1 node.
There was no problem when I used 8 GPUs in pretraining.
But when I use 8 GPUs in linear evaluation, I get the following error:
TypeError: forward() missing 1 required positional argument: 'x'
How can I solve it?
Hi @utjune , sorry for the very late reply. Do you use dp or ddp?
I used DP, because DDP is not implemented for linear evaluation XD.
So the error occurred when I tried to use 8 GPUs.
Does it only happen with 8 GPUs? What happens if you use fewer GPUs? What is your batch size?
Yes. I did not try more than 8 GPUs, but 4 GPUs are OK.
My batch size is 1024. I tried a batch size of 2048, but there was not enough memory.
8 GPUs, batch size 2048 -> error
4 GPUs, batch size 1024 -> no problem
4 GPUs, batch size 2048 -> out of memory
Thanks for letting me know; this is weird; I'll try to test it on my end when I can access an 8-GPU node.
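For anyone trying to reproduce this, here is a minimal DP linear-probe sketch. The `LinearProbe` module and the feature dimension are placeholders (not the repo's actual code); it only shows the expected wrap-and-call pattern. Note that `nn.DataParallel` falls back to running the module directly when no GPUs are visible, so this snippet also runs on CPU:

```python
import torch
import torch.nn as nn

# Hypothetical linear-evaluation head; feat_dim / num_classes are
# placeholders, not the values used in this repo.
class LinearProbe(nn.Module):
    def __init__(self, feat_dim=128, num_classes=10):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        return self.fc(x)

probe = LinearProbe()
# nn.DataParallel replicates the module and scatters the batch along
# dim 0 across all visible GPUs; the input must be passed through
# __call__ (i.e. probe(x)), not probe.forward(x) or probe.module(x).
probe = nn.DataParallel(probe)
features = torch.randn(16, 128)  # dummy batch of precomputed features
out = probe(features)
print(out.shape)  # torch.Size([16, 10])
```

If the real code invokes the wrapped model any other way (or wraps it twice), the replicas can end up called without their input, which would match the `forward() missing 1 required positional argument: 'x'` error.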