/industrial-object-detection

Object detection using image processing in an industrial setting

Primary language: MATLAB

Image Processing - Solving an Object Recognition Problem (CMP3108M_Assessment_01)

Overview

A tool for detecting screws and washers that can be generalised to an industrial setting, such as analysing objects on a conveyor belt. A camera is mounted directly above the incoming objects, which are photographed against a neutral white background that is clearly distinguishable from the metallic grey of the objects themselves. The dataset contains pictures of mixed fasteners, and the goal is to identify them automatically.

Summary of Implementation

Pre-Processing

To begin pre-processing, the data is loaded with the ‘imread’ function and converted to grayscale using ‘rgb2gray’. The grayscale image is then reduced to half its original size by bilinear interpolation using the ‘imresize’ function, with the input parameters ‘0.5’ and ‘bilinear’. For each output pixel, the function averages the four closest pixel values (a 2-by-2 neighbourhood) and uses this average as the new value. Next, the image is enhanced with contrast adjustment using the ‘imadjust’ function, and the histograms are plotted before and after this step.
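A minimal sketch of this pre-processing pipeline might look like the following; the filename ‘IMG_01.jpg’ and the variable names are placeholders, not necessarily those used in the assignment:

```matlab
% Load, convert to grayscale, downscale, and contrast-adjust.
I = imread('IMG_01.jpg');                      % load the image (placeholder name)
I_gray = rgb2gray(I);                          % convert RGB to grayscale
I_small = imresize(I_gray, 0.5, 'bilinear');   % halve the size, 2x2 averaging
figure; imhist(I_small); title('Before enhancement');
I_adj = imadjust(I_small);                     % stretch the contrast
figure; imhist(I_adj); title('After enhancement');
```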

Before enhancement, the histogram shows the vast majority of the intensity values in a very narrow range, roughly 150–220, which is not ideal. After enhancement, the pixel intensity values cover a much broader spectrum of the histogram, so low-intensity areas appear darker and high-intensity areas appear lighter. Finally, binarization is performed using the ‘imbinarize’ function on the enhanced image with additional input arguments. The parameters were tuned by trial and error: the binarized image needed to keep only the objects black and minimise the amount of background noise incorporated, while still detecting every object. The main variable altered to change the quality of the binarization was the sensitivity value. A value of ‘0.25’ was best for the chosen image and allowed all the screws and washers to be successfully separated, apart from one washer. However, this washer kept its outline features, which was enough for edge detection to proceed.
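The binarization step could be sketched as below. The README only states that a sensitivity of 0.25 was used, so the adaptive method and ‘dark’ foreground polarity here are assumptions (the objects are darker than the white background):

```matlab
% Binarisation with the sensitivity found by trial and error.
% 'adaptive' and 'ForegroundPolarity' are assumed arguments.
BW = imbinarize(I_adj, 'adaptive', ...
                'Sensitivity', 0.25, ...
                'ForegroundPolarity', 'dark');  % objects darker than background
imshow(BW);
```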

Edge Detection

Edge detection finds the boundaries of all the objects and is used as a precursor to fully segmenting them in task 3. MATLAB's image processing toolbox provides a built-in ‘edge’ function, which is the method used for this problem. The binarized image is passed to the function along with a detection method and a sensitivity threshold. The best method proved to be ‘Canny’, found by comparing all the edge detection methods; the comparison can be seen in the ‘Task2_Comparison.m’ file. Without the additional sensitivity argument, however, the Canny method was not the most successful, and the best methods were Roberts or Sobel. These were not selected because they offer less flexibility in manipulating the threshold, and by default not all the edges were detected. The second set of subplots in the comparison file shows the differences between sensitivity values for the Canny method. Values of 0.2 and 0.3 show the edges of the objects relatively well, without losing too much quality or missing objects, and without adding too much noise, so a value of 0.25 was chosen for the sensitivity threshold.
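The edge detection call itself is short; assuming ‘BW’ is the binarized image from the previous step, it might be:

```matlab
% Canny edge detection with the chosen sensitivity threshold of 0.25.
E = edge(BW, 'Canny', 0.25);
imshow(E);
```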

Simple Segmentation

To begin the segmentation process, a disk-shaped morphological structuring element of radius one is defined, which is required for the following dilation step. Dilation was chosen to ensure the washer and screw shapes became fully formed and none were missed. The dilation operation takes each pixel neighbouring an object and relabels it as part of the object. While this is an expanding operation, its effect can be limited by defining a small structuring element, which reduces the likelihood of two objects merging into one. The disk shape is used because, at radius one, it selects only the four pixels directly connected to the central pixel, so it does not stretch as far as other shapes.

Once dilation has occurred, the shapes’ edges are thickened, so any gaps left by the edge detection are likely to be closed. The ‘imfill’ function is then called with the parameter ‘holes’, which performs a flood-fill operation on the pixels inside the objects’ edges, turning them from the background colour to white; segmentation is then complete, producing a binary image with each object as a separate segment. The bottom-left screw and washer are very close together, and a disk radius above one would have joined them as a single object, so keeping the radius at one allows them to be segmented successfully.
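Assuming ‘E’ holds the Canny edge map, the dilate-then-fill sequence described above can be sketched as:

```matlab
se = strel('disk', 1);            % disk-shaped structuring element, radius 1
E_dil = imdilate(E, se);          % thicken the edges, closing small gaps
BW_seg = imfill(E_dil, 'holes');  % flood-fill the interiors of the objects
imshow(BW_seg);
```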

Object Recognition

The next step after successful segmentation was to find the individual shapes and label them. This was achieved with the ‘bwlabel’ function, which returns a label matrix in which zeros mark background pixels and each connected object is assigned its own positive integer. Next, the ‘regionprops’ function is applied to the labelled matrix to return basic information about each shape, which is collected in the table ‘properties_table’ and printed to the command window. These variables are then accessed individually and stored as double arrays; for example, ‘I_areas = [I_props.Area]’ accesses the area values. Step-4 in the source code shows the method used to plot the number of each shape: the X and Y values of the centroids are found, and a loop plots each shape's number at its centroid location using the ‘text’ function.
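A sketch of the labelling and annotation steps, with variable names taken from the description above (the exact property list requested is an assumption):

```matlab
[L, num] = bwlabel(BW_seg);                     % label connected components
I_props = regionprops(L, 'Area', 'Centroid', 'Perimeter');
properties_table = struct2table(I_props);
disp(properties_table);                         % print to the command window
I_areas = [I_props.Area];                       % areas as a double array
cents = cat(1, I_props.Centroid);               % centroid (X, Y) coordinates
imshow(label2rgb(L)); hold on;
for k = 1:num                                   % number each shape at its centroid
    text(cents(k,1), cents(k,2), num2str(k));
end
```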

Once the objects had been numerically labelled and the table formed, the variables were examined: all the screws have an area under 1000 pixels, and all the washers have an area above 1500 pixels, making recognition achievable this way. These two conditions are defined, and the locations where each condition is met are stored as a new shape object for screws and washers respectively, using the ‘find’ function to locate the matching positions in the two matrices. These newly formed separate shape images are passed to the ‘label2rgb’ function with a custom blue colourmap for the screws and red for the washers. Finally, the two are summed to form the completed object recognition image.
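One way to realise this classification, sketched here under the assumption that ‘L’ and ‘I_areas’ come from the labelling step (the exact colourmap construction in the original code is not shown in the README):

```matlab
% Split the label matrix by the area thresholds, then colour and combine.
screws  = ismember(L, find(I_areas < 1000));    % screws: area under 1000 px
washers = ismember(L, find(I_areas > 1500));    % washers: area above 1500 px
Ls = bwlabel(screws);  Lw = bwlabel(washers);
screws_rgb  = label2rgb(Ls, repmat([0 0 1], max(Ls(:)), 1), 'k');  % blue map
washers_rgb = label2rgb(Lw, repmat([1 0 0], max(Lw(:)), 1), 'k');  % red map
result = screws_rgb + washers_rgb;   % backgrounds are black, so a sum works
imshow(result);
```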

Robust Method

Additionally, a robust method of object recognition was required for the other images in the dataset. When the previous method was tested on them, many of the images contained multiple objects incorrectly joined together, as well as inaccurate recognition, and there were no parameters defined for the large screws. To overcome the problem of objects merging, the segmentation method was slightly altered: where ‘imdilate’ was used previously, ‘imclose’ is now the initial operation, followed again by ‘imfill’. This further decreases the likelihood of two objects becoming connected. Finally, ‘imopen’ is used to erode the image and then dilate it, giving the best overall result for the dataset. Alongside this change, the Canny edge detection parameters were slightly altered for a more precise result. Unfortunately, some images, such as ‘IMG_04’, still contained incorrectly segmented shapes.
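The revised segmentation sequence might be sketched as follows, reusing the radius-1 disk structuring element (whether the robust method keeps the same element is an assumption):

```matlab
% Robust segmentation: close, fill, then open.
se = strel('disk', 1);
E_cl   = imclose(E, se);          % dilation then erosion closes edge gaps
BW_f   = imfill(E_cl, 'holes');   % fill the object interiors
BW_seg = imopen(BW_f, se);        % erosion then dilation removes small noise
```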

Another key alteration for this method was to change the property used as a threshold for the objects. Previously, area was used because of the clear difference between the areas of the small screws and the washers; however, some of the large screws turned out to have areas similar to the washers, so recognition was no longer possible this way. The perimeter of each shape was chosen as the new way to differentiate the objects. While this is unfortunately a hard threshold, it worked for the vast majority of the images in the dataset: all objects with a perimeter under 122 are classed as small screws, those between 122 and 250 as washers, and everything above that as large screws. All the final recognition images can be seen in the output folder of the project.
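The three-way perimeter split could be expressed as below, assuming ‘L’ and ‘I_props’ come from the labelling step (the boundary handling at exactly 122 and 250 is an assumption):

```matlab
% Perimeter-based classification with the thresholds quoted above.
P = [I_props.Perimeter];                          % perimeters as a double array
small_screws = ismember(L, find(P < 122));        % perimeter under 122
washers      = ismember(L, find(P >= 122 & P <= 250));
large_screws = ismember(L, find(P > 250));        % everything above 250
```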