dist_train keeps waiting
FX-STAR opened this issue · 4 comments
FX-STAR commented
My env:
cuda10.2
torch==1.6.0
mmdetection==2.8.0
mmcv==1.2.4
After some iterations, GPU-Util shows 100%, but the process just keeps waiting.
Could you provide your env or any advice?
hyz-xmaster commented
Hi, I haven't run into this problem, so I can't offer an effective solution. The code was tested with CUDA 10.1, PyTorch 1.6.0, mmdet 2.5.0, and mmcv 1.1.5. You may have a look at this page for more information about training.
oym050922021 commented
@whoNamedCody, hi, has the problem been resolved? I also faced this problem.
FX-STAR commented
@whoNamedCody, hi, has the problem been resolved? I also faced this problem.
No. I think the problem may be in 'GiouLoss', but I haven't debugged it yet.
oym050922021 commented
Hi,
Thank you very much for your reply! The problem has been solved.