rmislam/PythonSIFT

question about generating img octave

pengbohua opened this issue · 2 comments

Nice work! One question about line 78. Why do we use the image in the middle of gaussian_images_in_octave for downsampling instead of the last one? Is this for some special reason?

Hi @pengbohua, great question! In the original paper (https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf) on page 7, note this sentence in the middle of the page:

"Once a complete octave has been processed, we resample the Gaussian image that has twice the initial value of σ (it will be 2 images from the top of the stack) by taking every second pixel in each row and column."

2 images from the top of the stack -- in other words, the third to last image.
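
As a rough illustration of that sentence, here is a minimal NumPy sketch. The function name `next_octave_base` is made up for this example, and plain slicing stands in for whatever resampling routine the repository actually uses. It assumes the octave is a list of 2-D arrays ordered from least to most blurred.

```python
import numpy as np

def next_octave_base(gaussian_images_in_octave):
    # Hypothetical helper, not the repository's code.
    # "2 images from the top of the stack" is the third to last image.
    base = gaussian_images_in_octave[-3]
    # Keep every second pixel in each row and column.
    return base[::2, ::2]

# Toy octave for s = 2: five 4x6 images, each filled with its own index.
octave = [np.full((4, 6), i, dtype=np.float32) for i in range(5)]
next_base = next_octave_base(octave)
print(next_base.shape)  # (2, 3)
```

In this toy stack the selected image is the one at index 2, so every pixel of `next_base` equals 2.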

One octave here actually covers more than the doubling of the initial sigma value. We divide the octave into s intervals, so for s = 2, say, we would get 3 images:

The original image
The original image blurred by 2 ^ (1 / s) = 2 ^ (1 / 2)
The original image blurred by 2 ^ (2 / s) = 2 ^ (2 / 2) = 2, in other words double the blur

However, we're going to do difference-of-Gaussians later, so we need an image one interval below the original image, and one image one interval above the doubled image.

In total, we have s + 3 = 2 + 3 = 5 images. This is where Lowe gets the s + 3 expression from.
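
To make the count concrete, here is a small sketch that lists the blur factors in the conceptual order described above, from one interval below the original image to one interval above the doubled image. The function name `conceptual_blur_factors` is an assumption made for this example.

```python
def conceptual_blur_factors(s):
    # Hypothetical helper: exponents run from -1/s up to (s + 1)/s,
    # which gives s + 3 factors in total.
    return [2 ** (i / s) for i in range(-1, s + 2)]

factors = conceptual_blur_factors(2)
print(len(factors))  # 5
```

In this ordering the original image (factor 1) is the second entry and the doubled image (factor 2) is the second to last.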

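Note that this is different from the order in which we actually compute the images. Rather than starting the octave at blur factor 2 ^ (-1 / s) and ending it at 2 ^ (3 / s), we start with the original image and end at blur factor 2 ^ (4 / s) (these exponents are for s = 2; in general the last one is 2 ^ ((s + 2) / s)). With this shift, the image with twice the initial sigma sits at index s, which is the third to last of the s + 3 images. Lowe does this just because it makes the code look a little cleaner.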
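
A quick check of the computed ordering, again as a sketch with a made-up function name (`computed_blur_factors`): when the exponents run from 0 to (s + 2) / s, the third to last factor is exactly 2 for any s.

```python
def computed_blur_factors(s):
    # Hypothetical helper: exponents run from 0/s up to (s + 2)/s,
    # still s + 3 factors, but starting at the original image.
    return [2 ** (i / s) for i in range(s + 3)]

for s in (2, 3, 5):
    factors = computed_blur_factors(s)
    # The third to last entry has exponent s/s = 1, i.e. double the blur.
    print(s, factors[-3])
```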

I hope this explanation is sufficient. I'm going to close this issue, but feel free to reopen it if you feel your question has not been answered.