XiSHEN0220/RANSAC-Flow

Unable to reproduce paper results on HPatches

ducha-aiki opened this issue · 9 comments

Hi,

Thank you for the repository!
I am doing some sanity checks before trying to build on top of your work, and I have some issues reproducing the results. Specifically, I run

python evaluation.py --outDir ImageNet_WO_FT --transformation Homography --maxCoarse 0 --imageNet
python getResults.py --coarsePth ImageNet_WO_FT_Coarse --finePth ImageNet_WO_FT_Fine

And get the following output:

Scene 6, Average end-point error (EPE) : 8.342
Scene 5, Average end-point error (EPE) : 5.148
Scene 3, Average end-point error (EPE) : 2.912
Scene 2, Average end-point error (EPE) : 1.086
Scene 4, Average end-point error (EPE) : 3.070

As opposed to Table 1, where error is from 0.51 to 5.16:

[image: Table 1 from the paper]

Could you please point out what I am doing wrong?

Hi, thanks for your interest in our project!!!
I did a fresh pull, and re-ran the above commands.
These are what I obtained:
Scene 3, Average end-point error (EPE) : 2.328
Scene 4, Average end-point error (EPE) : 3.189
Scene 6, Average end-point error (EPE) : 5.253
Scene 2, Average end-point error (EPE) : 0.517
Scene 5, Average end-point error (EPE) : 5.348

RANSAC introduces some variance, but the results should be comparable.

My torch version is : 1.2.0
torchvision version is : 0.4.0
Kornia version is: 0.1.4.post2

Could you run the command for coarse flow:
python getResults.py --coarsePth ImageNet_WO_FT_Coarse --finePth ImageNet_WO_FT_Fine --onlyCoarse

I would like to see whether the problem is in the coarse alignment or in the fine flow.

Thanks a lot!!!

-Xi

My kornia is the same, 0.1.4.post2, but my pytorch is 1.5.0. That is probably the source of the problem. I will check with 1.2.0 + 0.4.0 and come back.
Thanks a lot!

After reinstalling pytorch, torchvision and Pillow, but without recomputing the features:

coarse:

Scene 6, Average end-point error (EPE) : 101.095
Scene 5, Average end-point error (EPE) : 5.759
Scene 3, Average end-point error (EPE) : 3.688
Scene 2, Average end-point error (EPE) : 1.287
Scene 4, Average end-point error (EPE) : 6.356

fine

Scene 6, Average end-point error (EPE) : 7.876
Scene 5, Average end-point error (EPE) : 4.656
Scene 3, Average end-point error (EPE) : 2.332
Scene 2, Average end-point error (EPE) : 0.534
Scene 4, Average end-point error (EPE) : 2.577

So it looks like the evaluation script, not just the extraction, really depends on the library versions...
The run with recomputed features is on its way...

And with both extraction and eval rerun, I get the same results as you. Thanks for your time!

pytorch 1.2.0 + torchvision 0.4.0

Scene 6, Average end-point error (EPE) : 5.183
Scene 5, Average end-point error (EPE) : 5.491
Scene 4, Average end-point error (EPE) : 2.451
Scene 3, Average end-point error (EPE) : 2.367
Scene 2, Average end-point error (EPE) : 0.513

pytorch 1.5.0 + torchvision 0.6.0:

Scene 6, Average end-point error (EPE) : 8.342
Scene 5, Average end-point error (EPE) : 5.148
Scene 4, Average end-point error (EPE) : 3.070
Scene 3, Average end-point error (EPE) : 2.912
Scene 2, Average end-point error (EPE) : 1.086

As a suggestion, maybe it is worth mentioning this in the README and/or pinning the pytorch/torchvision versions in requirements.sh?
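
To make the suggestion concrete, here is a minimal sketch of a version guard that an evaluation script could run at startup. The helper name `version_mismatches` and the `EXPECTED` table are hypothetical and not part of the repository; the version numbers are the ones reported as working in this thread.

```python
# Versions reported as working earlier in this thread (assumed to be the reference setup).
EXPECTED = {"torch": "1.2.0", "torchvision": "0.4.0", "kornia": "0.1.4.post2"}


def version_mismatches(installed, expected=EXPECTED):
    """Return {package: (found, wanted)} for every package whose version differs."""
    return {
        name: (installed.get(name), wanted)
        for name, wanted in expected.items()
        if installed.get(name) != wanted
    }


# The two setups compared above (kornia was identical in both).
good = {"torch": "1.2.0", "torchvision": "0.4.0", "kornia": "0.1.4.post2"}
bad = {"torch": "1.5.0", "torchvision": "0.6.0", "kornia": "0.1.4.post2"}

print(version_mismatches(good))
print(version_mismatches(bad))
```

In a real script the `installed` dictionary could be filled from `torch.__version__`, `torchvision.__version__` and `kornia.__version__`. Those strings may carry a build suffix, so they might need normalizing before the comparison.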

Thanks for the suggestion !!!
It is in the README, but I should probably highlight it at the beginning. I am going to update the doc. Anyway, thanks for pointing it out!!!

After checking the PyTorch docs, there might be a difference in the function F.grid_sample.
F.grid_sample in Pytorch 1.2.0
F.grid_sample in Pytorch 1.5.0

Also mentioned on Stack Overflow

I am not sure this is the problem. I will check it carefully and come back to you.

If the difference is indeed due to different 'align_corners' defaults, you could pass the parameter explicitly and get the same results regardless of the version.

P.S. And if the reason is the "align_corners=True" vs "align_corners=False", I believe in the new results more, tbh. Of course, I don't expect ranking to be changed, just the absolute values.
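
For illustration, a minimal pure-Python sketch of how the two conventions turn a normalized coordinate in [-1, 1] into a pixel position. This follows the documented meaning of align_corners, but it is not the PyTorch implementation; the function name `to_pixel` and the image width of 240 are hypothetical.

```python
def to_pixel(x, size, align_corners):
    """Map a normalized coordinate x in [-1, 1] to a pixel position on an axis of length `size`."""
    if align_corners:
        # -1 and 1 refer to the centers of the first and last pixels.
        return (x + 1) / 2 * (size - 1)
    # -1 and 1 refer to the outer edges of the first and last pixels.
    return ((x + 1) * size - 1) / 2


size = 240  # hypothetical image width
for x in (-1.0, 0.0, 0.5, 1.0):
    print(x, to_pixel(x, size, True), to_pixel(x, size, False))
```

The two mappings agree at the image center and drift apart towards the borders, so a grid built for one convention lands on slightly different pixels under the other.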

There is no parameter align_corners in Pytorch 1.2.0.

For your question, I think it is more about the training-test consistency.
During training, the flow is learned with F.grid_sample behaving as if align_corners=True.
At test time, it is better to use the same operation.

> There is no parameter align_corners in Pytorch 1.2.0.

My bad. It must be hardcoded to True somewhere in the backend, then.

> For your question, I think it is more about the training-test consistency.

Sorry, I don't quite follow. The features are off-the-shelf ImageNet, aren't they?

The ImageNet features are used to estimate homographies.
F.grid_sample is then used to warp the image. (This is where the two versions start to differ.)

Fine flow is estimated from the coarsely aligned pair.
But to obtain the final flow, I still need to use F.grid_sample on the coarse flow.

F.grid_sample is also used extensively in my training code to learn the fine flow.
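
A minimal 1D sketch of why this matters for a warp: a sampling grid built under one convention reproduces the input only when it is read under the same convention. This is pure Python with linear interpolation, not the repository's code. The helper `sample_1d` is hypothetical, and clamping at the borders is a simplification (the default padding mode of F.grid_sample is zeros).

```python
def sample_1d(signal, grid, align_corners):
    """Linearly interpolate `signal` at the normalized positions in `grid`."""
    n = len(signal)
    out = []
    for x in grid:
        # Convert the normalized coordinate to a pixel position.
        if align_corners:
            p = (x + 1) / 2 * (n - 1)
        else:
            p = ((x + 1) * n - 1) / 2
        p = min(max(p, 0.0), n - 1.0)  # clamp to the valid range (simplification)
        i0 = int(p)
        i1 = min(i0 + 1, n - 1)
        w = p - i0
        out.append((1 - w) * signal[i0] + w * signal[i1])
    return out


signal = [0.0, 10.0, 20.0, 30.0, 40.0]
n = len(signal)
# Identity grid built under the align_corners=True convention.
identity = [-1 + 2 * i / (n - 1) for i in range(n)]

same = sample_1d(signal, identity, True)    # grid read as it was built
mixed = sample_1d(signal, identity, False)  # same grid, other convention
print(same)
print(mixed)
```

Here `same` equals the input, while `mixed` is shifted away from it except at the center. A flow learned under one convention and applied under the other would carry a similar systematic offset.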