The training process of SphericalMask.

Question

RongkunYang opened this issue 8 months ago · 15 comments

Hello, dear authors, thank you for your nice work!
I have tried to train the network using the ISBNet encoder weights and the preprocessed data from ISBNet.
However, I cannot reach an mAP of 61; my training result is 0.545, 0.715, and 0.798 for mAP, AP50, and AP25, respectively. May I ask whether there are some training details I have overlooked?

Thank you!

Answer 1 · 2024-04-26T01:05:48.000Z

Thanks for your interest. What is your training environment (i.e., torch and CUDA versions)? My training/testing environment uses torch 1.12.1 and CUDA 11.3. Also, can you confirm that you can reproduce the testing result (62.6 mAP) when you use the checkpoint (spherical_mask_155.pth)?

Answer 2 · 2024-04-26T01:07:35.000Z

I will also run the training to confirm again and let you know.

Answer 3 · 2024-04-26T01:11:57.000Z

OK, thank you for your fast response. I use the same environment as yours.

I noticed that the config file has an item "pretrain_decoder". Is this item important, and does it also need the ISBNet weights?

Answer 4 · 2024-04-26T01:17:32.000Z

Yes, both "pretrain_encoder" and "pretrain_decoder" currently have to use the same path, as in "spherical_mask.yaml". That is a good point. I will change this in the code and commit the fix to avoid confusion.
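
A minimal sketch of a sanity check for the point above, assuming the YAML config has already been loaded into a nested Python dict. Only the key names "pretrain_encoder" and "pretrain_decoder" come from this thread; the helper names, the nesting of the example config, and the example path are hypothetical.

```python
def find_key(cfg, key):
    """Depth-first search for `key` in a nested dict; returns None if absent."""
    if isinstance(cfg, dict):
        if key in cfg:
            return cfg[key]
        for value in cfg.values():
            found = find_key(value, key)
            if found is not None:
                return found
    return None


def pretrain_paths_match(cfg):
    """True only if both pretrain entries exist and point to the same path."""
    enc = find_key(cfg, "pretrain_encoder")
    dec = find_key(cfg, "pretrain_decoder")
    return enc is not None and enc == dec


# Hypothetical config layout and path, for illustration only.
example_cfg = {
    "train": {
        "pretrain_encoder": "pretrains/example.pth",
        "pretrain_decoder": "pretrains/example.pth",
    }
}
print(pretrain_paths_match(example_cfg))
```

Running a check like this before training would flag a config in which only one of the two entries was updated.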

Answer 5 · 2024-04-26T02:04:43.000Z

Oh, I just read the checkpoint loader. The pretrained weights in this repo seem to differ from the ISBNet pretrained weights: they have the same parameter sizes as the Spherical Mask model.
So may I ask how the pretrained weights were trained? Were the semantic loss, offset loss, and box loss used to train the backbone, as in ISBNet?
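
The kind of comparison described above can be sketched as follows. This is only an illustration: it works on plain {parameter name: shape} mappings, and all parameter names in the example are made up. With PyTorch, such a mapping could be built from a loaded state dict (for example `{k: tuple(v.shape) for k, v in state_dict.items()}` after `torch.load(path, map_location="cpu")`); whether the weights sit at the top level of the checkpoint or under a nested key depends on how it was saved and is not stated in this thread.

```python
def compare_checkpoints(shapes_a, shapes_b):
    """Compare two {parameter_name: shape} mappings.

    Returns three sorted lists: names only in A, names only in B,
    and names present in both but with different shapes.
    """
    keys_a, keys_b = set(shapes_a), set(shapes_b)
    only_a = sorted(keys_a - keys_b)
    only_b = sorted(keys_b - keys_a)
    mismatch = sorted(
        k for k in keys_a & keys_b
        if tuple(shapes_a[k]) != tuple(shapes_b[k])
    )
    return only_a, only_b, mismatch


# Hypothetical parameter names and shapes, for illustration only.
backbone_only = {
    "backbone.conv.weight": (32, 6, 3),
    "decoder.dyco.0.weight": (64, 32),
}
with_extra_modules = {
    "backbone.conv.weight": (32, 6, 3),
    "decoder.dyco.0.weight": (64, 32),
    "decoder.extra_head.weight": (10, 64),
}
only_a, only_b, mismatch = compare_checkpoints(backbone_only, with_extra_modules)
print(only_a, only_b, mismatch)
```

Names that appear in only one checkpoint, or with different shapes, are the ones a non-strict loader would leave at their initial values, so they are worth inspecting before training.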

Answer 6 · 2024-04-26T02:27:25.000Z

In the training code, gradient accumulation is set to 16. Does gradient accumulation affect the performance?

Answer 7 · 2024-04-26T18:54:14.000Z

Thanks for your comment.

The pre-trained weight was from the initial release of ISBNet with 2 dynamic convolution layers (I noticed that the current ISBNet release uses 3 dynamic convolution layers). We stopped ISBNet training when the model reached its best AP, around 56, and used it as the pre-trained weight for Spherical Mask.
When I uploaded the pre-trained weight, I think I used the Spherical Mask script somewhere to load the weight and saved it by mistake, which is why some modules for Spherical Mask are included. I will remove them and upload the weight again to avoid confusion.
In case you are wondering, we use the pre-trained weight because training became unstable (exploding gradients), driving the decoder outputs to NaN, when the model was initialized without the pre-trained weight. Interestingly, we found a similar issue in other methods using similar U-Net-based backbone architectures.
Yes, gradient accumulation helps boost the performance slightly in our experiments.

I will update the code based on your feedback this weekend!
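
For readers unfamiliar with gradient accumulation, here is a minimal, framework-free sketch of the general idea, not the repository's training loop. It assumes a toy one-parameter linear model with a mean-squared-error loss and equal-size micro-batches; all names and data are illustrative. Under those assumptions, summing the gradients of 16 micro-batches, each scaled by 1/16, gives the same gradient as one batch that is 16 times larger.

```python
def mse_grad(w, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, accum_steps):
    """Sum the gradients of `accum_steps` equal-size micro-batches, each
    scaled by 1/accum_steps, as would be done before one optimizer step."""
    size = len(data) // accum_steps
    grad = 0.0
    for i in range(accum_steps):
        micro = data[i * size:(i + 1) * size]
        grad += mse_grad(w, micro) / accum_steps
    return grad


# Toy data: 32 samples of y = 3x + 1, split into 16 micro-batches of 2.
data = [(float(x), 3.0 * x + 1.0) for x in range(32)]
full = mse_grad(0.5, data)
accum = accumulated_grad(0.5, data, accum_steps=16)
print(abs(full - accum) < 1e-6)
```

The exact equivalence relies on the loss being an average over samples and on the model having no batch-dependent layers; how the loss is scaled in the Spherical Mask training code is not described in this thread.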

Answer 8 · 2024-04-27T06:26:59.000Z

OK, thank you for your enthusiastic answer!

Answer 9 · 2024-04-28T22:45:09.000Z

I just updated the code and confirmed that I could reproduce the result.

Answer 10 · 2024-04-30T01:35:44.000Z

Yes, the code works well, and I have also reproduced the result.
The first time I trained, I made a mistake in the checkpoint loading.
Thank you for your enthusiastic answers.

Answer 11 · 2024-05-15T09:54:22.000Z

Hello, @RongkunYang and @yunshin

I read this closed issue, and it seems that you have successfully reproduced the experiment. I would like to ask you some questions because I'm having some trouble reproducing it.

At the beginning of this issue you said:

"However, I can not achieve the performance of the mAP 61, the performance of my training result is 0.545 0.715 0.798 for mAP, AP50 and AP25,"

Are you talking about the performance when training on only the train set, or on the train-val set?

Have you tried reproducing the result on the test set?
I have successfully reproduced the result on the validation set. However, I'm having some trouble reproducing it on the test set. Could you kindly give me some advice or help?

Best regards,
Jongwook Kim

Answer 12 · 2024-05-15T10:28:48.000Z

Hello, @frankkim1108. I haven't tried to reproduce the test set result; the results we discussed above were obtained by training on the training set and validating on the validation set.
Could you describe in detail your process for reproducing the result on the test set?

Answer 13 · 2024-05-15T10:55:32.000Z

Hello, @RongkunYang. Thank you for your quick answer.

We pretrained the train-val benchmark backbone model provided by ISBNet and used it to train Spherical Mask on the train-val dataset.

The resulting test score is about 55 (similar to the ISBNet results).

Dear @yunshin, may I ask whether I have overlooked some training details?
I've been trying everything to reproduce your result. Your advice would be very helpful.

Thank you.

Answer 14 · 2024-05-15T11:08:47.000Z

OK, the training process seems to have no problems. I will also try to reproduce the test result, and we will wait for the author to share more details about the test process.

Answer 15 · 2024-08-02T06:04:15.000Z

@frankkim1108 Did you reproduce the result on the test set? We also cannot reproduce it.