multi-GPU version

Question

MichaelYu781 opened this issue 2 years ago · 4 comments

Hi, @stepankonev!
The current code runs on a single GPU, and training is relatively slow (about 1.5 iterations per second on our side).
Do you have a multi-GPU version?

Answer 1 · 2022-07-10T11:07:27.000Z

Hi, @MichaelYu781!
I trained the model on a single GPU, so a multi-GPU version is not provided.
Best,

Answer 2 · 2022-07-10T15:41:22.000Z

Got it. Thank you, sir!

Answer 3 · 2022-07-18T15:30:35.000Z

Hi, @stepankonev!
What value should the loss finally converge to? If the loss were always positive, it would converge toward zero. But when I train this model, the loss is negative. It keeps going down, but I don't know which value would indicate that training is close to completion. The attached photo is my training log; is this normal?
Thanks for your time!
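
For context on the negative values: a minimal sketch, assuming the loss is a negative log-likelihood of a continuous density (an assumption, not confirmed in this thread). Such a loss has no lower bound at zero, because a density can exceed 1 when the predicted spread is small. The function `gaussian_nll` below is a hypothetical helper written for illustration, not code from this repository.

```python
import math

def gaussian_nll(x, mean, sigma):
    """Negative log-likelihood of x under a 1-D Gaussian N(mean, sigma^2).

    Hypothetical illustration only; not the repository's loss function.
    """
    log_norm = 0.5 * math.log(2 * math.pi * sigma ** 2)
    squared_error = (x - mean) ** 2 / (2 * sigma ** 2)
    return log_norm + squared_error

# Perfect prediction of the mean, with a wide predicted spread:
# the density at x is below 1, so the NLL is positive.
wide = gaussian_nll(x=0.0, mean=0.0, sigma=1.0)

# Same prediction with a narrow predicted spread:
# the density at x is above 1, so the NLL is negative.
narrow = gaussian_nll(x=0.0, mean=0.0, sigma=0.1)

print(f"sigma=1.0 -> NLL={wide:.4f}")
print(f"sigma=0.1 -> NLL={narrow:.4f}")
```

Under that assumption, a negative and still decreasing loss is not a sign of a bug by itself; a plateau in validation metrics is usually a more informative signal of convergence than the sign of the loss.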

Answer 4 · 2022-07-21T23:33:43.000Z

Hello! Please, let's keep the discussion clean. This issue should be devoted to the multi-GPU version of the model. It would also be great if you shared your config in a separate issue, after first checking for duplicates. Thanks!
P.S. I believe it should have converged to some point by now.