rmislam/PythonSIFT

How to implement dense sift?

helloMLNNVR opened this issue · 2 comments

Hello, I am trying to implement dense SIFT with your code. What would I need to change to generate keypoints on a regular grid instead of at the local extrema? Please let me know if you can help me with this!

Hey there, thanks for looking at PythonSIFT! Dense SIFT is an interesting idea. It should be fairly straightforward to modify PythonSIFT to do this. Basically, you'll just need to modify findScaleSpaceExtrema() (and it's probably a good idea to rename this function). You can try something like this:

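# pysift.py should already import these; they are repeated here so the snippet's dependencies are explicit
from cv2 import KeyPoint
from numpy import float32

def computeDenseKeypoints(gaussian_images, dog_images, num_intervals, sigma, image_border_width, step_size=5):
    """Place keypoints on a regular grid in every scale-space layer instead of at extrema.

    step_size is the grid spacing in pixels, measured in each octave's own image. Set it to whatever you'd like.
    """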
    keypoints = []

    for octave_index, dog_images_in_octave in enumerate(dog_images):
        for image_index, (first_image, second_image, third_image) in enumerate(zip(dog_images_in_octave, dog_images_in_octave[1:], dog_images_in_octave[2:])):
            # (i, j) is the (row, column) of a grid point in this octave's image
            for i in range(image_border_width, first_image.shape[0] - image_border_width, step_size):
                for j in range(image_border_width, first_image.shape[1] - image_border_width, step_size):
                    keypoint = KeyPoint()
                    keypoint.pt = (j * (2 ** octave_index), i * (2 ** octave_index))
                    keypoint.octave = octave_index + (image_index + 1) * (2 ** 8)
                    keypoint.size = sigma * (2 ** ((image_index + 1) / float32(num_intervals))) * (2 ** (octave_index + 1))
                    keypoint.response = abs(second_image[i, j])  # use the DoG magnitude so the response is non-negative
                    keypoints_with_orientations = computeKeypointsWithOrientations(keypoint, octave_index, gaussian_images[octave_index][image_index + 1])
                    for keypoint_with_orientation in keypoints_with_orientations:
                        keypoints.append(keypoint_with_orientation)
    return keypoints

Note that we use image_index + 1 instead of image_index because we want the second image in every triplet of adjacent images.
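To make the `octave` and `size` fields a bit more concrete, here is a small standalone sketch of the same arithmetic the snippet above uses. The helper names (`pack_octave`, `unpack_octave`, `keypoint_size`) are made up for illustration and are not part of PythonSIFT, and the unpacking assumes small non-negative octave indices.

```python
def pack_octave(octave_index, image_index):
    """Pack the octave and layer into one integer, as in the snippet above."""
    layer = image_index + 1  # the second image of the triplet
    return octave_index + layer * (2 ** 8)


def unpack_octave(packed):
    """Recover (octave_index, layer) from the packed value (hypothetical helper)."""
    return packed & 255, (packed >> 8) & 255


def keypoint_size(sigma, num_intervals, octave_index, image_index):
    """Keypoint size for a given octave and layer, using the same formula as above."""
    layer = image_index + 1
    return sigma * (2 ** (layer / float(num_intervals))) * (2 ** (octave_index + 1))


packed = pack_octave(octave_index=2, image_index=0)
print(packed, unpack_octave(packed))
print(keypoint_size(sigma=1.6, num_intervals=3, octave_index=0, image_index=0))
```

The low byte holds the octave and the next byte holds the layer, so a grid keypoint carries the same scale information as a keypoint found at an extremum.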

You can delete the functions isPixelAnExtremum(), localizeExtremumViaQuadraticFit(), computeGradientAtCenterPixel(), and computeHessianAtCenterPixel(). Things become much simpler.

The rest of the code should require little or no modification. Give it a try. However, even if it works, it will most likely be very slow, because every grid point in every layer of every octave becomes a keypoint that needs an orientation and a descriptor. For this to be practical, you'll want to vectorize / parallelize the code somehow.
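As a starting point for vectorizing, here is a minimal sketch that builds all grid coordinates for one octave with NumPy instead of two nested Python loops. `dense_grid_points` is a hypothetical helper, not something in PythonSIFT, and it only covers the coordinate generation. The orientation and descriptor steps would still need their own treatment.

```python
import numpy as np


def dense_grid_points(image_shape, image_border_width, step_size, octave_index):
    """Return an (N, 2) float32 array of (x, y) grid points in input-image scale.

    Points come out in the same order as the nested loops: rows (i) outer,
    columns (j) inner.
    """
    rows = np.arange(image_border_width, image_shape[0] - image_border_width, step_size)
    cols = np.arange(image_border_width, image_shape[1] - image_border_width, step_size)
    # With the default 'xy' indexing, both outputs have shape (len(rows), len(cols))
    jj, ii = np.meshgrid(cols, rows)
    scale = 2 ** octave_index
    return np.stack([jj.ravel() * scale, ii.ravel() * scale], axis=1).astype(np.float32)


points = dense_grid_points((20, 30), image_border_width=5, step_size=5, octave_index=1)
print(points.shape)
```

The row count of the result is also a quick way to estimate how many keypoints a given `step_size` produces per layer before you commit to a full run.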

Hope this helps!

I believe this issue has been resolved, so I'm going to close it. Please feel free to reopen this issue if you feel your question hasn't been answered.