Object Detection with SIFT

Scale-Invariant Feature Transform (SIFT) is a computer vision algorithm that extracts distinctive keypoints and their descriptors from an image. As the name suggests, these features are robust to changes in image scale, which makes the algorithm suitable for object detection. Unlike most contemporary object detection algorithms, it does not require a large training image dataset.

This code uses the OpenCV library to demonstrate how SIFT can be used for object detection. The algorithm can be broken down into the following four steps:

1. Capture an image (Training Image) of the object of interest and crop out any unnecessary details. The code assumes that the user already has it in a predefined folder.
2. Create a feature extractor to extract the features from the training image.
3. Create a feature matcher to look for the features in a Query Image.
4. Draw a bounding box around the detected object.
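
The matching step can be illustrated without OpenCV. The following is a minimal sketch, not the repository's implementation: it assumes descriptors are plain float vectors, uses brute-force nearest-neighbour search, and filters matches with the ratio test described in Lowe's paper. The names `Descriptor`, `squaredDistance`, and `matchDescriptors`, and the threshold of 0.75, are hypothetical choices made for illustration.

```cpp
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// A descriptor is a fixed-length float vector (128 values for SIFT).
// Hypothetical type alias, used only in this sketch.
using Descriptor = std::vector<float>;

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// For each training descriptor, find its two nearest query descriptors and
// keep the match only if the nearest one is clearly closer than the second
// nearest (ratio test). Returns (trainIndex, queryIndex) pairs.
std::vector<std::pair<std::size_t, std::size_t>> matchDescriptors(
    const std::vector<Descriptor>& train,
    const std::vector<Descriptor>& query,
    float ratio = 0.75f) {  // assumed threshold, tune for your images
    std::vector<std::pair<std::size_t, std::size_t>> matches;
    for (std::size_t t = 0; t < train.size(); ++t) {
        float best = std::numeric_limits<float>::max();
        float second = std::numeric_limits<float>::max();
        std::size_t bestIdx = 0;
        for (std::size_t q = 0; q < query.size(); ++q) {
            const float d = squaredDistance(train[t], query[q]);
            if (d < best) {
                second = best;
                best = d;
                bestIdx = q;
            } else if (d < second) {
                second = d;
            }
        }
        // Distances are squared, so compare against the squared ratio.
        // With fewer than two query descriptors the test cannot be applied.
        if (query.size() >= 2 && best < ratio * ratio * second) {
            matches.emplace_back(t, bestIdx);
        }
    }
    return matches;
}
```

Calling `matchDescriptors(trainDescriptors, queryDescriptors)` returns index pairs that a later step could use to locate the object in the Query Image. A training descriptor that is equally close to two query descriptors is treated as ambiguous and dropped, which is the point of the ratio test.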

Usage

To use and test this code, clone the repository and run the project in Visual Studio in Debug mode. Make sure that the OpenCV library is properly configured in Visual Studio. Detailed instructions on how to do this can be found here.

Alternatively, users can compile and run the Random Object Detection.cpp file with a compiler of their choice.

The user also needs to provide the correct path to their Training Image on line 26 of the Random Object Detection.cpp file:

Mat trainingImg = imread("<user file path>", 0);

Example Results

The following images show the code successfully detecting an object of interest, in this case the book Astrophysics for People in a Hurry by Neil deGrasse Tyson.

The algorithm does fairly well even for partially occluded objects.

References

Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision 60, 91–110 (2004). PDF
