hustvl/QueryInst

Multi-GPU training hangs when "Groudtruth Not Founded!" is logged

thangnx183 opened this issue · 1 comments

I'm training on my own dataset and I get the log message "Groudtruth Not Founded!". It doesn't look like a bug in the code, but the training process just hangs at that point and stops making progress.

Groudtruth Not Founded!
Groudtruth Not Founded!
Groudtruth Not Founded!
Groudtruth Not Founded!
Groudtruth Not Founded!

No more log output appears after that.

I tried the same settings on a single GPU. I still get the same message, but training keeps running and seems to work fine:

Groudtruth Not Founded!
Groudtruth Not Founded!
Groudtruth Not Founded!
Groudtruth Not Founded!
Groudtruth Not Founded!
Groudtruth Not Founded!
Groudtruth Not Founded!
Groudtruth Not Founded!
Groudtruth Not Founded!
2021-06-01 11:03:43,099 - mmdet - INFO - Epoch [1][300/72815]   lr: 7.493e-06, eta: 58 days, 7:54:12, time: 1.307, data_time: 0.009, memory: 19179, stage0_loss_cls: 1.3532, stage0_pos_acc: 94.0000, stage0_loss_bbox: 0.7737, stage0_loss_iou: 0.9962, stage0_loss_mask: 4.7144, stage1_loss_cls: 1.5701, stage1_pos_acc: 94.0000, stage1_loss_bbox: 0.7169, stage1_loss_iou: 0.9065, stage1_loss_mask: 4.1660, stage2_loss_cls: 1.2717, stage2_pos_acc: 94.0000, stage2_loss_bbox: 0.6249, stage2_loss_iou: 0.8356, stage2_loss_mask: 3.8767, stage3_loss_cls: 1.3196, stage3_pos_acc: 94.0000, stage3_loss_bbox: 0.6106, stage3_loss_iou: 0.8257, stage3_loss_mask: 4.3968, stage4_loss_cls: 1.2100, stage4_pos_acc: 94.0000, stage4_loss_bbox: 0.5954, stage4_loss_iou: 0.7888, stage4_loss_mask: 4.1171, stage5_loss_cls: 1.2037, stage5_pos_acc: 94.0000, stage5_loss_bbox: 0.6028, stage5_loss_iou: 0.7866, stage5_loss_mask: 4.1907, loss: 42.4537
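
Since the message suggests that some samples reach the model without any ground truth, one quick check is to look for images that have no annotations at all. The sketch below assumes the custom dataset is stored as COCO-style JSON (the issue does not say which format is used). The helper names `find_images_without_gt` and `load_coco`, and the toy data, are made up for illustration and are not part of QueryInst or mmdet.

```python
import json
from collections import Counter


def find_images_without_gt(coco):
    """Return the ids of images with no annotation entries.

    `coco` is assumed to be a COCO-style dict with "images" and
    "annotations" lists (an assumption about the dataset format).
    """
    # Count how many annotations reference each image id.
    counts = Counter(ann["image_id"] for ann in coco.get("annotations", []))
    # Keep the images that are never referenced by any annotation.
    return [img["id"] for img in coco.get("images", []) if counts[img["id"]] == 0]


def load_coco(path):
    """Load a COCO-style annotation file from a (hypothetical) path."""
    with open(path) as f:
        return json.load(f)


# Toy data standing in for a real annotation file.
toy = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"id": 10, "image_id": 1},
        {"id": 11, "image_id": 1},
        {"id": 12, "image_id": 3},
    ],
}

empty_ids = find_images_without_gt(toy)
print("images without ground truth:", empty_ids)
```

For a real dataset, `find_images_without_gt(load_coco(path))` would list the image ids to inspect. This only narrows down which samples could trigger the message; it does not explain why the multi-GPU run hangs.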

Found the answer in #3.