yuantn/MI-AOD

Error in training with SSD

Traceback (most recent call last):
  File "tools/train.py", line 257, in <module>
    main()
  File "tools/train.py", line 170, in main
    distributed=distributed, validate=(not args.no_validate), timestamp=timestamp, meta=meta)
  File "/media/gc/1T/MI-AOD-master/mmdet/apis/train.py", line 120, in train_detector
    runner.run(data_loaders_L, cfg.workflow, cfg.total_epochs)
  File "/home/gc/anaconda3/envs/miaod/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 161, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/gc/anaconda3/envs/miaod/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 33, in train
    outputs = self.model.train_step(X_L, self.optimizer, **kwargs)
  File "/home/gc/anaconda3/envs/miaod/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 31, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/media/gc/1T/MI-AOD-master/mmdet/models/detectors/base.py", line 228, in train_step
    losses = self(**data)
  File "/home/gc/anaconda3/envs/miaod/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/media/gc/1T/MI-AOD-master/mmdet/core/fp16/decorators.py", line 51, in new_func
    return old_func(*args, **kwargs)
  File "/media/gc/1T/MI-AOD-master/mmdet/models/detectors/base.py", line 162, in forward
    return self.forward_train(x, img_metas, **kwargs)
  File "/media/gc/1T/MI-AOD-master/mmdet/models/detectors/single_stage.py", line 83, in forward_train
    losses = self.bbox_head.forward_train(x, img_metas, y_loc_img, y_cls_img, y_loc_img_ignore)
  File "/media/gc/1T/MI-AOD-master/mmdet/models/dense_heads/base_dense_head.py", line 58, in forward_train
    L_det_1 = self.L_det(*loss_inputs, y_loc_img_ignore=y_loc_img_ignore)
  File "/media/gc/1T/MI-AOD-master/mmdet/models/dense_heads/ssd_head.py", line 228, in L_det
    assert torch.isfinite(all_y_f).all().item(), 'classification scores become infinite or NaN!'
AssertionError: classification scores become infinite or NaN!

Hello. When I train SSD on my own dataset, this error appears after training has been running for a while.

Please check whether your dataset's annotations contain errors, and try adjusting the learning rate or setting the warmup parameters.
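As a rough illustration of the annotation check suggested here, the sketch below flags boxes that commonly destabilize detector training: non-finite coordinates, zero or negative width or height, and boxes extending past the image. The function name `find_bad_boxes` and the `(x1, y1, x2, y2)` pixel format are assumptions made for this example; they are not part of MI-AOD or MMDetection, and whether out-of-bounds boxes matter depends on your dataset.

```python
import math


def find_bad_boxes(boxes, img_w, img_h):
    """Return (index, reason) pairs for boxes that look invalid.

    Hypothetical helper: `boxes` is assumed to be an iterable of
    (x1, y1, x2, y2) tuples in pixel coordinates.
    """
    problems = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        if not all(math.isfinite(v) for v in (x1, y1, x2, y2)):
            problems.append((i, "non-finite coordinate"))
        elif x2 <= x1 or y2 <= y1:
            problems.append((i, "zero or negative width/height"))
        elif x1 < 0 or y1 < 0 or x2 > img_w or y2 > img_h:
            problems.append((i, "outside image bounds"))
    return problems


if __name__ == "__main__":
    # Made-up boxes for a 300x300 image, purely for demonstration.
    sample = [
        (10, 20, 110, 220),        # valid
        (50, 50, 50, 80),          # zero width
        (-5, 0, 40, 40),           # starts left of the image
        (0, 0, float("nan"), 10),  # NaN coordinate
    ]
    for idx, reason in find_bad_boxes(sample, img_w=300, img_h=300):
        print(idx, reason)
```

Running a check like this over every image's annotations before training can help rule out the dataset as the source of the NaN scores.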

This problem is not related to the MI-AOD method.
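For the learning-rate and warmup suggestion, a config fragment along these lines may help. It uses the MMDetection 2.x-style `optimizer` and `lr_config` keys; all numeric values are illustrative assumptions, not the settings shipped with MI-AOD, and the key names should be checked against the config files in your copy of the repository.

```python
# Illustrative values only: start from your existing config and lower
# the learning rate step by step until training stays finite.
optimizer = dict(
    type="SGD",
    lr=1e-4,            # assumed value, smaller than a typical SSD default
    momentum=0.9,
    weight_decay=5e-4,
)

lr_config = dict(
    policy="step",
    warmup="linear",    # ramp the learning rate up from a small value
    warmup_iters=500,   # assumed number of warmup iterations
    warmup_ratio=0.001, # starting LR = lr * warmup_ratio
    step=[8, 11],       # assumed decay epochs; match your schedule
)
```

A longer warmup or a smaller base learning rate trades slower early progress for stability, which is usually the right trade when scores diverge partway through training.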