Two questions about keypoint scale
Doom9234 opened this issue · 3 comments
Hi, thank you for the great work. It has helped me a lot while learning SIFT. I have two questions about keypoint.size:
1. When computing keypoint.size in localizeExtremumViaQuadraticFit(), you used octave_index + 1 instead of octave_index. Is this because the original input image is doubled by linear interpolation?
2. In computeKeypointsWithOrientations(), the scale is multiplied by 0.5 and divided by 2 ** octave_index, which is a little different from the original paper. Am I missing something?
I think the scale used to compute the orientation should be 2 ** ((image_index + extremum_update[2]) / float32(num_intervals)) * (2 ** octave_index), which is different from the scale in computeKeypointsWithOrientations().
Hi there! Thanks for looking into PythonSIFT so deeply.
Yes, I'm using octave_index + 1 instead of octave_index because we double the input image size. Later, in convertKeypointsToInputImageSize(), we do keypoint.size *= 0.5 to account for this.
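The arithmetic behind this can be checked numerically. The following is a minimal sketch, assuming sigma = 1.6 and num_intervals = 3 (typical SIFT settings that are not stated in this thread); the helper names are hypothetical and are not taken from pysift.py.

```python
import math

def size_with_octave_plus_one(sigma, layer, num_intervals, octave_index):
    # Size as computed with octave_index + 1 (hypothetical helper).
    return sigma * (2 ** (layer / float(num_intervals))) * (2 ** (octave_index + 1))

def halve_size(size):
    # Mirrors the keypoint.size *= 0.5 step (hypothetical helper).
    return 0.5 * size

sigma, num_intervals = 1.6, 3  # assumed values, not from the thread
layer, octave_index = 1.0, 2

halved = halve_size(size_with_octave_plus_one(sigma, layer, num_intervals, octave_index))

# After halving, the result matches the same expression written with
# octave_index instead of octave_index + 1.
plain = sigma * (2 ** (layer / float(num_intervals))) * (2 ** octave_index)
print(math.isclose(halved, plain))
```

In other words, the extra factor of 2 introduced by octave_index + 1 is exactly what the later multiplication by 0.5 cancels.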
Actually, the way scale is computed in computeKeypointsWithOrientations() is consistent with the way keypoint.size is computed in localizeExtremumViaQuadraticFit(). Let's compare line 181:
keypoint.size = sigma * (2 ** ((image_index + extremum_update[2]) / float32(num_intervals))) * (2 ** (octave_index + 1))
with line 227:
scale = scale_factor * 0.5 * keypoint.size / float32(2 ** octave_index)
Note that we can rewrite line 227 like this, and it's mathematically equivalent:
scale = scale_factor * keypoint.size / float32(2 ** (octave_index + 1))
In fact, I made this change in the latest commit in order to be consistent with line 181. Sorry for the confusion. However, there's nothing wrong with the original line 227. We are simply reversing the * (2 ** (octave_index + 1)) operation done in line 181 to recover the scale. We then multiply by scale_factor, as mentioned in the paper.
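To make the equivalence concrete, here is a small sketch that evaluates both forms of line 227 over a few octaves and layers. The values sigma = 1.6, num_intervals = 3, and scale_factor = 1.5 are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def scale_original(size, octave_index, scale_factor=1.5):
    # Line 227 as originally written.
    return scale_factor * 0.5 * size / float(2 ** octave_index)

def scale_rewritten(size, octave_index, scale_factor=1.5):
    # Line 227 after the rewrite.
    return scale_factor * size / float(2 ** (octave_index + 1))

sigma, num_intervals = 1.6, 3  # assumed values, not from the thread
all_match = True
for octave_index in range(4):
    for layer in range(1, num_intervals + 1):
        within_octave = sigma * 2 ** (layer / float(num_intervals))
        size = within_octave * 2 ** (octave_index + 1)  # same form as line 181
        a = scale_original(size, octave_index)
        b = scale_rewritten(size, octave_index)
        # Both forms agree, and both reduce to scale_factor times the
        # within-octave scale once the octave factor is divided out.
        all_match = all_match and math.isclose(a, b) and math.isclose(a, 1.5 * within_octave)

print(all_match)
```

Both expressions divide out the same power of two, so the octave factor applied in line 181 disappears and only the within-octave scale, multiplied by scale_factor, remains.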
Note that size here corresponds to the same size variable used in OpenCV's SIFT implementation, which you can find here:
https://github.com/opencv/opencv_contrib/blob/master/modules/xfeatures2d/src/sift.cpp
If you take a look at lines 572 and 661, you can see they compute size and scale the same way. You can even set up breakpoints in OpenCV's sift.cpp and PythonSIFT's pysift.py, compare the keypoints at each step of the computation, and verify they are the same within rounding error. The size variable isn't mentioned in the original SIFT paper; it's just a convenient way to store the scale across different octaves and layers using a single number.
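One way to see why a single number is enough: with the form of line 181, size grows by the same constant ratio from one layer to the next, including across octave boundaries. A minimal sketch, again assuming sigma = 1.6 and num_intervals = 3 and using a hypothetical helper name:

```python
import math

def keypoint_size(sigma, layer, num_intervals, octave_index):
    # Same form as line 181 quoted earlier in this thread (hypothetical helper).
    return sigma * (2 ** (layer / float(num_intervals))) * (2 ** (octave_index + 1))

sigma, num_intervals = 1.6, 3  # assumed values, not from the thread

# Walk through layers 0..num_intervals-1 of three consecutive octaves.
sizes = [keypoint_size(sigma, layer, num_intervals, octave)
         for octave in range(3)
         for layer in range(num_intervals)]

# Ratio between each size and the previous one.
ratios = [b / a for a, b in zip(sizes, sizes[1:])]
step = 2 ** (1.0 / num_intervals)
print(all(math.isclose(r, step) for r in ratios))
```

The ratio stays at 2 ** (1 / num_intervals) even when the octave index increments, so size forms one continuous geometric sequence over the whole pyramid.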
I hope this helps! Does this answer your questions?
I believe this issue has been resolved, so I'm going to close it. Please feel free to reopen this issue if you feel your question hasn't been answered.