GrumpyZhou/patch2pix

Question regarding the loss definitions

Parskatt opened this issue · 3 comments

Hi,

Looking here:
https://github.com/GrumpyZhou/patch2pix/blob/main/train_patch2pix.py#L138-L200

cdist is used for the medium level prediction, and mdist is used for the fine level.
I'm not quite sure I follow the logic here. Why are mdist and fdist not used for the medium/fine level?

Hi,

So cdist measures the Sampson errors of the raw coarse NCNet matches. We then use those errors (cdist) to define the GT positive and negative proposals for the classification task of the mid-level refinement, and we apply match regression only to the positive cases. The same holds for mdist, which represents the Sampson errors of the input proposals to the fine-level refinement.
I hope this is helpful.
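
To make the description above concrete, here is a minimal sketch in plain Python, not the code from train_patch2pix.py. The helper names `sampson_distance` and `level_targets`, the threshold value, and the use of a mean over positives as the regression loss are all assumptions made for illustration. It shows the classification labels coming from the errors of the input proposals, with the regression term averaged only over proposals labelled positive.

```python
def sampson_distance(F, p1, p2):
    """Sampson (first-order epipolar) error of a match p1 <-> p2 under F.

    F is a 3x3 fundamental matrix given as nested lists; p1 and p2 are (x, y).
    """
    x1 = (p1[0], p1[1], 1.0)
    x2 = (p2[0], p2[1], 1.0)
    # F @ x1 and F^T @ x2
    Fx1 = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    Ftx2 = [sum(F[j][i] * x2[j] for j in range(3)) for i in range(3)]
    num = sum(x2[i] * Fx1[i] for i in range(3)) ** 2
    den = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    return num / den


def level_targets(input_errors, output_errors, threshold):
    """Hypothetical helper: targets for one refinement level.

    input_errors:  errors of the proposals fed into this level
                   (cdist for the mid level, mdist for the fine level).
    output_errors: errors of the matches regressed by this level.
    threshold:     assumed cut-off separating positives from negatives.
    """
    # Classification GT: is the *input* proposal good or not?
    labels = [1.0 if e < threshold else 0.0 for e in input_errors]
    # Regression is applied only to the positive proposals (assumed: mean error).
    positives = [o for o, l in zip(output_errors, labels) if l == 1.0]
    reg_loss = sum(positives) / len(positives) if positives else 0.0
    return labels, reg_loss


# Toy example: F for a pure translation along x, so epipolar lines are horizontal.
F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
on_line = sampson_distance(F, (10, 20), (30, 20))   # same row in both images
off_line = sampson_distance(F, (10, 20), (30, 24))  # 4 px off the epipolar line
labels, reg_loss = level_targets([on_line, off_line, 0.5], [0.1, 3.0, 0.3], threshold=1.0)
```

In this toy case the second proposal is labelled negative, so its large output error does not contribute to the regression term.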

Your explanation makes sense. However, I'm still a bit confused as to why you don't use the medium/fine predictions as GT, i.e. why do you use the level below?

We used the upper level (i.e. the input proposals), following the intuition that the classifier should tell whether a proposal is good or not. You can also use the output level directly, in which case the score says how confident the network is that the regressed match is a good one. Both are valid to me.
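
As a rough illustration of the two options, the sketch below thresholds either the input errors or the output errors to build the classification labels. The function name `classification_labels`, its `source` parameter, and the threshold are hypothetical and not part of the repository.

```python
def classification_labels(input_errors, output_errors, threshold, source="input"):
    """Hypothetical helper: build classification GT from one of two error sets.

    source="input":  label says whether the incoming proposal was good.
    source="output": label says whether the regressed match is good.
    """
    if source not in ("input", "output"):
        raise ValueError("source must be 'input' or 'output'")
    errors = input_errors if source == "input" else output_errors
    return [1.0 if e < threshold else 0.0 for e in errors]


# A bad proposal (error 8.0) that the regressor refines into a good match (error 0.5).
input_errors = [8.0, 0.2]
output_errors = [0.5, 0.1]
by_input = classification_labels(input_errors, output_errors, 1.0, source="input")
by_output = classification_labels(input_errors, output_errors, 1.0, source="output")
```

The two choices disagree only on proposals whose quality changes across the refinement, such as the first one here.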