eric-yyjau/pytorch-superpoint

training parameters

Opened this issue · 12 comments

Thanks for the work !

I am training my own network, but I cannot get the same results as yours.
When I train MagicPoint, the precision is not steady: it can be 0.15 over one 2k-iteration window and then 0.25 over the next 2k iterations.
Could you please provide your training parameters?
Do you change your learning rate during training? And do you use the same image size (240×320) for both export and training on COCO?

Cheers!

Hi @YanShuo1992 ,

Thanks for your message.
To generate keypoints for training, I suggest directly using the pre-trained network for homography adaptation (https://github.com/eric-yyjau/pytorch-superpoint/tree/master/logs/magicpoint_synth20/checkpoints).
For the parameters, you may refer to this config (https://github.com/eric-yyjau/pytorch-superpoint/blob/master/logs/superpoint_coco_heat2_0/config.yml).
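
For reference, loading that pre-trained checkpoint before running homography adaptation might look roughly like this (the model class, checkpoint filename, and state-dict key below are my assumptions; please check the repo's export script for the exact loading code):

import torch
# Model class shipped with this repo; swap in SuperPointNet if that is what you use.
from models.SuperPointNet_gauss2 import SuperPointNet_gauss2

net = SuperPointNet_gauss2()
ckpt = torch.load(
    "logs/magicpoint_synth20/checkpoints/superPointNet_checkpoint.pth.tar",  # hypothetical filename
    map_location="cpu",
)
net.load_state_dict(ckpt["model_state_dict"])  # the key name is an assumption
net.eval()  # inference mode for exporting pseudo ground truth via homography adaptation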

I didn't change the learning rate. Yeah, the image size should be 240×320.
Best,

Cheers @eric-yyjau
I followed the same config and got good results on my task. However, I noticed an interesting phenomenon.
When I train the model, the training and validation precision drop steeply. The same thing happens with recall.
[plot: validation precision]

They drop at around 50k-100k iterations, then the precision climbs back to a reasonable value by around 170k iterations.

Did you observe the same phenomenon? Any comments?

Hi @YanShuo1992 ,

Yeah, I remember seeing this issue before, but I don't know what caused it.
I think some PyTorch versions have this issue.
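
If anyone wants to compare environments when this shows up, a quick version check (plain PyTorch calls, nothing repo-specific) is:

import torch
print(torch.__version__)                # e.g. "1.10.1"
print(torch.version.cuda)               # CUDA build the wheel was compiled against
print(torch.backends.cudnn.version())   # cuDNN version, which can also affect training behavior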

If you know how to resolve this, please also let me know.
Thanks

35p32 commented

Thanks for your great work, @eric-yyjau; I really admire it.
But something goes wrong when training MagicPoint on Synthetic Shapes; I ran into the same problem as @YanShuo1992.
I use torch 1.9.0+cu111, and the training precision and loss look strange:

[screenshots: training precision and loss curves]

Is this caused by the PyTorch version (if so, which version do you recommend?), or by something else?

Hi @35p32,

Thanks for your question. Sorry, I don't know why this happens.
Maybe it is because of the PyTorch version. If you find a way to prevent it from happening, please share it with me.
Thank you!

Hi,
I encountered the same problem. AMSGrad seems to have solved it for me:
optimizer = optim.Adam(net.parameters(), lr=lr, betas=(0.9, 0.999), amsgrad=True)

https://pytorch.org/docs/stable/generated/torch.optim.Adam.html#torch.optim.Adam
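
For completeness, here is a minimal self-contained sketch of the switch (toy module and placeholder learning rate, not the repo's actual network or config values):

import torch
import torch.nn as nn
import torch.optim as optim

net = nn.Conv2d(1, 65, kernel_size=3, padding=1)  # stand-in for the detector head
lr = 1e-4                                         # placeholder; take the real value from config.yml
optimizer = optim.Adam(net.parameters(), lr=lr, betas=(0.9, 0.999), amsgrad=True)

x = torch.randn(2, 1, 240, 320)                   # dummy batch of 240x320 grayscale images
loss = net(x).mean()                              # dummy loss, just to exercise one optimizer step
loss.backward()
optimizer.step()
optimizer.zero_grad()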

@johnnyboloni Hi, I'm trying to reproduce the result. Can you share your loss/metric curves and your software environment info, please? Thanks!

@XhqGlorry11 Hey, I'm using torch 1.10.1+cu102 and Python 3.6.13.
In orange is the original training, in red is after using AMSGrad.
[plot: training curves; orange = original run, red = with AMSGrad]

BTW, I'm now investigating why precision and recall are so bad, even without the weird error:
[plot: precision and recall]
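
For context, the precision/recall I'm looking at is roughly this kind of thresholded comparison between the predicted heatmap and the ground-truth keypoint labels; this is my own sketch, not the repo's exact evaluation code:

import numpy as np

def detector_pr(heatmap, gt_mask, threshold=0.015):
    """heatmap: HxW predicted keypoint probabilities; gt_mask: HxW binary ground-truth keypoints."""
    pred = heatmap >= threshold          # keep detections above an example confidence threshold
    gt = gt_mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()  # predicted pixels that hit a ground-truth keypoint
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    return precision, recall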

Do you get similar behavior?

@johnnyboloni Thank you for your reply! I had a little trouble with convergence, and I want to re-confirm the training details with you.
I downloaded the COCO pseudo ground truth and trained using the config with some modifications.
[screenshots: modified config settings]
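
Concretely, the kind of change I mean looks something like this; the key names and values below are only placeholders for illustration (not my actual settings), so adjust them to the layout of the reference config.yml:

import yaml

# Load the reference config and save a modified copy before launching training.
with open("logs/superpoint_coco_heat2_0/config.yml") as f:
    config = yaml.safe_load(f)

# Placeholder tweaks; the real keys depend on the config's structure.
config["model"]["learning_rate"] = 1e-4
config["data"]["preprocessing"]["resize"] = [240, 320]

with open("configs/my_superpoint_coco.yaml", "w") as f:
    yaml.safe_dump(config, f)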
How about your settings?

@XhqGlorry11 I'm trying to train MagicPoint from scratch. I used SuperPointNet_gauss2 in the beginning, but I'm now trying SuperPointNet. (I've used magicpoint_shapes_pair.yaml.)

@johnnyboloni I didn't train MagicPoint from scratch. I directly used the COCO pseudo ground truth and trained SuperPointNet. Here are my metric curves.
[plots: metric curves]


@XhqGlorry11 Is the training data in your TensorBoard plots as good as you would expect?