Low GPU utilization and high CPU utilization
Hello, thanks a lot for your code. The GPU utilization is low, about 50~100%, and it gets even worse when I run several instances at the same time on one GPU server. On the other hand, the CPU utilization is very high, more than 300%. I wonder whether the data loading process takes too much time. Do you have the same problem when you run the code? It took about 11 hours on the NYU dataset (i9-9900K, Titan Xp). Thank you!!!
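One way to check whether data loading really is the bottleneck is to time the data-preparation step and the training step separately. The sketch below is only an illustration and is not taken from this repository: `load_batch` and `train_step` are hypothetical stand-ins (simulated with `time.sleep`) for the real augmentation code and the real training call.

```python
import time


def load_batch():
    """Hypothetical stand-in for CPU-side augmentation and heat-map generation."""
    time.sleep(0.02)
    return [0.0] * 8


def train_step(batch):
    """Hypothetical stand-in for one GPU training step."""
    time.sleep(0.01)
    return sum(batch)


def profile(num_steps=10):
    """Return the total seconds spent in data loading and in training."""
    data_time = 0.0
    step_time = 0.0
    for _ in range(num_steps):
        t0 = time.perf_counter()
        batch = load_batch()
        t1 = time.perf_counter()
        train_step(batch)
        t2 = time.perf_counter()
        data_time += t1 - t0
        step_time += t2 - t1
    return data_time, step_time


if __name__ == "__main__":
    data_time, step_time = profile()
    total = data_time + step_time
    print("data loading: %.0f%% of wall time" % (100 * data_time / total))
```

If the data-loading share dominates, the GPU is mostly waiting for the CPU, and a faster GPU is unlikely to help much.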
Q1: We use online data augmentation, so for each batch the code must use the CPU to augment the data (rotation, translation, scaling) and to generate the 2D heat-maps of the batch in a loop. So the GPU-Util will keep fluctuating between 0.1 and 1. If you are interested in this part, you can try to optimize it and share the method.
Q2: In the training file, the loop runs for 200 epochs; in fact, 110 is enough.
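One possible way to optimize the online augmentation described in Q1 is to prepare the next batches in the background while the current batch is being trained on. The following is a minimal sketch under stated assumptions, not the method used in this repository: `augment_batch` and `train_step` are hypothetical placeholders (simulated with `time.sleep`), and the buffer size of 4 is an arbitrary choice.

```python
import queue
import threading
import time


def augment_batch(index):
    """Hypothetical stand-in for rotation/translation/scaling plus heat-map generation."""
    time.sleep(0.02)
    return index


def train_step(batch):
    """Hypothetical stand-in for one GPU training step."""
    time.sleep(0.02)
    return batch


def prefetch(num_batches, buffer_size=4):
    """Yield batches prepared by a background thread, up to buffer_size ahead."""
    buffer = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the stream

    def producer():
        for i in range(num_batches):
            buffer.put(augment_batch(i))  # blocks while the buffer is full
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


def run_sequential(num_batches):
    """Augment and train one after the other, as in a plain training loop."""
    return [train_step(augment_batch(i)) for i in range(num_batches)]


def run_prefetched(num_batches):
    """Train on batches that were augmented in the background."""
    return [train_step(batch) for batch in prefetch(num_batches)]
```

A background thread only helps when the augmentation releases Python's global interpreter lock, which `time.sleep`, file I/O, and many NumPy operations do. If the augmentation is a pure-Python loop, worker processes would be needed instead.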
Thank you for your response! I wonder whether I should upgrade my GPU from the TITAN XP to a 2080Ti or RTX in order to speed up the training process?
In my view, there may be little difference in speed between the Titan and the 2080 Ti.
In fact, I used to train the model on a GTX 1080, and one night is enough, while you are asleep.
And what about your CPU? I find that it makes a big difference whether the model is trained with an Intel i9-9900K (TITAN XP) or a XEON series CPU (1080Ti): running 110 epochs takes about 11 hours and 15 hours, respectively.
I am very interested in the research of your lab. Could you share the website with me, if possible? Thank you very much!
Our lab is just getting started in this field, so the lab has no website.
What a pity!
This is my personal computer (a 1080, not a 2080 Ti), and one night is enough.
But your computer takes longer, which is strange in itself.
I also noticed that your GitHub account has starred many repositories related to robotics hardware. Do you want to pay more attention to this field in the future? Thank you!!!
I'm trying to move to robot grasping, but it may be just a preliminary exploration.
I will graduate next year.
Hihi, I am in mechanical school. I also want to focus on object pose estimation, especially in robot grasping applications.
I changed the TensorFlow environment from 1.3 to 1.9, and the training time dropped to 5.5 hours.
Thank you!!!
good luck