xinge008/Cylinder3D

Stuck when starting training

mc171819 opened this issue · 3 comments

Hi, I tried to run your training code, but training never starts. I have been stuck here for a long time. Do you have any advice on how to get training started?
image
@xinge008

CPU usage is high right now, but GPU usage is low. Other projects work fine on the GPU, so it is not a GPU problem. This is the output of nvidia-smi:
image

After waiting for about half an hour, training started and everything seems normal. It seems that epoch 0, iteration 0 does not run training on the GPU. It's a weird problem.
image

The problem is the validation pass that runs at epoch 0, iteration 0, before any training has happened. Adding a global_iter > 0 condition to the validation check solves it.
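
For anyone hitting the same thing, here is a minimal sketch of that guard. It assumes validation is triggered by a modulo check on the iteration counter; `check_iter` and `should_validate` are hypothetical names used for illustration, not necessarily what the training script uses.

```python
def should_validate(global_iter: int, check_iter: int) -> bool:
    """Return True when a validation pass should run at this iteration.

    `check_iter` is a hypothetical name for the validation interval.
    """
    # 0 % check_iter == 0, so a bare modulo check would trigger a full
    # validation pass at iteration 0, before any training step has run.
    return global_iter > 0 and global_iter % check_iter == 0


# Iterations at which validation would fire during a short run.
validation_iters = [i for i in range(10) if should_validate(i, check_iter=4)]
```

With the extra condition, iteration 0 is skipped and the first validation pass runs only after `check_iter` training iterations.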