zwx8981/LIQE

Printing training process suddenly stopped during training

Closed this issue · 6 comments

image
During training, the progress output suddenly stopped printing. What is the reason?

Not sure; it may be similar to #4 (comment). Just stop the run and try again.

image
If pin_memory is set to True, the error in the screenshot is reported, so I set it to False. Is that related to this problem?

Or, how to solve this problem?

  1. I didn't try setting pin_memory to True. In my case, the training hang occurs occasionally and at random. Sorry, but I think the best practice for handling this issue is still to retry.
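
For anyone who wants to automate the "stop and retry" advice, here is a minimal sketch, not part of LIQE. It assumes the training script prints progress regularly, so a long silence can be treated as a hang. The names `run_with_stall_watchdog` and `train_with_retries`, the `train.py` command, and the timeout values are all hypothetical. Each retry starts the command again from scratch unless your script resumes from a checkpoint on its own.

```python
import queue
import subprocess
import sys
import threading


def run_with_stall_watchdog(cmd, stall_seconds):
    """Run cmd and echo its output.

    Returns the exit code, or None if the process was killed because it
    printed nothing for stall_seconds.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = queue.Queue()

    def pump():
        # Forward each output line to the queue so the main thread can
        # wait on it with a timeout.
        for line in proc.stdout:
            lines.put(line)
        lines.put(None)  # sentinel: the output stream was closed

    threading.Thread(target=pump, daemon=True).start()

    while True:
        try:
            line = lines.get(timeout=stall_seconds)
        except queue.Empty:
            # No output for stall_seconds: assume the run is stuck.
            proc.kill()
            proc.wait()
            return None
        if line is None:
            return proc.wait()
        sys.stdout.write(line)


def train_with_retries(cmd, stall_seconds=600, max_attempts=3):
    """Restart cmd whenever it goes silent, up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        code = run_with_stall_watchdog(cmd, stall_seconds)
        if code is not None:
            return code
        print(f"attempt {attempt}: no output for {stall_seconds}s, restarting")
    raise RuntimeError("training stalled on every attempt")


if __name__ == "__main__":
    # Hypothetical command; replace with your actual training entry point.
    train_with_retries([sys.executable, "train.py"])
```

Pick `stall_seconds` well above the longest normal gap between log lines (for example, an evaluation pass), otherwise a healthy run will be killed.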

alright, thank you very much!

@GitHub-Ju Hi, you may try this. #17 (comment)