hustvl/MIMDet

FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged.

mike-huangdj opened this issue · 1 comment

Hello @junchen14, did you solve this problem? I'm having the same problem.
When I run the default configuration, "python lazyconfig_train_net.py --num-gpus 1 --config-file configs/mimdet/mimdet_vit_base_mask_rcnn_fpn_sr_0p5_800_1333_4xdec_coco_3x.py --num-machines 1", training fails with:
"FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged."
I understand that I should use the full MAE pre-trained weights provided by the author @Yuxin-CV, but I am confused: I could not find the pre-trained weights for the decoder. Where are the decoder weights, and where should I put them?

The details are shown below:
When running "python lazyconfig_train_net.py --num-gpus 1 --config-file configs/mimdet/mimdet_vit_base_mask_rcnn_fpn_sr_0p5_800_1333_4xdec_coco_3x.py --num-machines 1", the following appears:
[Screenshots: 11, 22, 33]

I found the definitions of the encoder and decoder in "mimdet_vit_base_mask_rcnn_fpn_sr_0p5_800_1333_4xdec_coco_3x.py", as shown in the figure:
[Screenshot: 44]

The encoder weights are set in common.py:
[Screenshot: 55]

I look forward to your reply. Apologies for any disturbance.

@mike-huangdj Hello, can you run with a GPU? I see you didn't change the encoder and decoder parameters.
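One way to check whether a downloaded checkpoint actually contains decoder weights is to list its parameter names before pointing the encoder and decoder settings at it. The following is a minimal sketch, not taken from the MIMDet code: the helper names are hypothetical, the "model" wrapper key is an assumption about the checkpoint layout, and the "decoder" name prefix is an assumed naming convention that may differ in your file.

```python
from collections import Counter


def summarize_prefixes(keys):
    """Count parameter names by their first dotted component."""
    return Counter(key.split(".", 1)[0] for key in keys)


def has_decoder(keys):
    """Return True if any parameter name starts with 'decoder' (assumed naming)."""
    return any(key.startswith("decoder") for key in keys)


def load_keys(path):
    """Load a checkpoint and return its parameter names.

    Assumes a PyTorch checkpoint that is either a flat state dict or
    wraps one under a "model" key; adjust if your file differs.
    """
    import torch  # imported lazily so the helpers above work without it

    ckpt = torch.load(path, map_location="cpu")
    state = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt
    return list(state.keys())


# Illustrative key names only (hypothetical, not read from a real checkpoint).
encoder_only = ["patch_embed.proj.weight", "blocks.0.attn.qkv.weight", "norm.weight"]
full = encoder_only + ["decoder_embed.weight", "decoder_blocks.0.attn.qkv.weight"]
```

With a real file, calling `has_decoder(load_keys(path))` on your own checkpoint path would show whether it is an encoder-only checkpoint, and `summarize_prefixes` gives a quick overview of which parts it contains.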