low GPU utilization rate

Question

zoraup opened this issue 4 years ago · 3 comments

Hello,
Very nice work!
I wonder why GPU utilization is always low when I run the code, which slows training down. Is there a way to solve this problem? Thanks a lot.

Answer 1 · 2020-12-24T03:50:10.000Z

Same problem here. When I run the cross-domain experiments on the baseline with Conv4-net, I changed batch_size from 16 to 128, but the GPU utilization rate is still very low. One epoch takes more than 10 minutes on an RTX 2080 Ti GPU. I wonder how long the entire experiment takes in the original environment. @wyharveychen

Answer 2 · 2020-12-24T08:01:11.000Z

I found that at the beginning of training the speed is very slow, while speeding up in the later training procedure.

Epoch 0 | Batch 468/469 | Loss 3.944826: 100%|████████████████████| 469/469 [12:53<00:00,  1.65s/it]
Epoch 1 | Batch 468/469 | Loss 3.422845: 100%|████████████████████| 469/469 [04:53<00:00,  1.60it/s]
Epoch 2 | Batch 468/469 | Loss 3.214282: 100%|████████████████████| 469/469 [01:00<00:00,  7.78it/s]
Epoch 3 | Batch 468/469 | Loss 3.078954: 100%|████████████████████| 469/469 [00:48<00:00,  9.57it/s]
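
One way to check where the time goes in the slow early epochs is to time the wait for each batch separately from the training step. The log above (first epoch slowest, later epochs much faster) might be consistent with the data loader being the bottleneck, but that is only a guess and is not confirmed in this thread. Below is a minimal sketch in plain Python; `split_batch_time` and `step_fn` are hypothetical names, not part of this repository, and `loader` can be any iterable of batches.

```python
import time


def split_batch_time(loader, step_fn, max_batches=50):
    """Return (seconds spent waiting for batches, seconds spent in step_fn).

    Hypothetical helper for diagnosis only. `loader` is any iterable of
    batches and `step_fn` is a callable that runs one training step.
    """
    data_s = 0.0
    step_s = 0.0
    it = iter(loader)
    for _ in range(max_batches):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is data loading
        except StopIteration:
            break
        t1 = time.perf_counter()
        # Assumption: on a GPU, step_fn should end with a device sync
        # (e.g. torch.cuda.synchronize()), otherwise asynchronous kernels
        # make the step look faster than it really is.
        step_fn(batch)
        t2 = time.perf_counter()
        data_s += t1 - t0
        step_s += t2 - t1
    return data_s, step_s
```

If the first number dominates, the GPU is mostly idle waiting for input, and increasing batch_size alone would not be expected to raise utilization.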

Answer 3 · 2021-01-15T22:30:09.000Z

"speeding up in the later training procedure." Is there a reason why this happened?