mcmingchang/yolox_keypoint_segment

After one epoch of successful training, I get this error message. What could be the reason?

PurvangL opened this issue · 0 comments

2023-05-01 15:16:45 | INFO | yolox.core.trainer:259 - epoch: 2/300, iter: 250/313, mem: 11174Mb, iter_time: 3.676s, data_time: 3.100s, total_loss: 8.0, iou_loss: 3.3, l1_loss: 0.0, conf_loss: 3.4, cls_loss: 0.9, seg_loss: 0.3, lr: 1.294e-03, size: 384, ETA: 4 days, 0:29:34

Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/workspace/tools/train.py", line 191, in
    launch(
  File "/workspace/yolox/core/launch.py", line 82, in launch
    mp.start_processes(
  File "/opt/conda/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/opt/conda/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 140, in join
    raise ProcessExitedException(
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with signal SIGKILL
root@e7968c684346:/workspace# /opt/conda/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 8 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '