Computer Vision

This project consists of several tasks related to computer vision, including boundary detection, face detection and recognition, feature detection and image matching, image processing, and image segmentation. Each section below provides images and explanations of the tasks implemented in the project.

Sections

  • Image Preprocessing
  • Boundary Detection
  • Features Detection and Image Matching
  • Image Segmentation
  • Face Detection and Recognition

Image Preprocessing

This is the initial step in the image processing chain, where raw image data is enhanced to improve its quality and remove distortions or artifacts. It prepares the image for further analysis by applying various correction techniques.

I. Noise Generation

Users can manipulate images by introducing various types of noise using a convenient combo box interface. This feature allows for experimentation with different noise models, providing insights into the impact of noise on image quality and subsequent processing algorithms.

   1. Uniform Noise

uniform_noise

   2. Gaussian Noise

gaussian_noise

   3. Salt & Pepper Noise

salt_and_pepper
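The three noise models above can be sketched in a few lines of NumPy. This is a minimal illustration; the function names and default parameters here are ours, not the project's actual API:

```python
import numpy as np

def add_uniform_noise(img, low=-20, high=20):
    """Add noise drawn uniformly from [low, high] to every pixel."""
    noisy = img.astype(np.float64) + np.random.uniform(low, high, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_gaussian_noise(img, mean=0.0, sigma=15.0):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    noisy = img.astype(np.float64) + np.random.normal(mean, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_and_pepper(img, amount=0.05):
    """Flip a fraction `amount` of pixels to pure black or white."""
    noisy = img.copy()
    mask = np.random.random(img.shape)
    noisy[mask < amount / 2] = 0          # pepper
    noisy[mask > 1 - amount / 2] = 255    # salt
    return noisy
```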

II. Image Filtering

We implemented various filters to enhance image quality and reduce noise (e.g., average, Gaussian, and median filters). We experimented with different kernel sizes (3x3, 5x5, and 7x7) to observe their effect on the filtering process.

   1. Average Filter Applied on Uniform Noise

Average

   2. Gaussian Filter Applied on Uniform Noise

gaussian

   3. Median Filter Applied on Uniform Noise

median
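For reference, the average (box) and median filters can be implemented without any library beyond NumPy. This is a sketch under the assumption of edge-replicated padding; the project's own kernels may handle borders differently:

```python
import numpy as np

def average_filter(img, k=3):
    """Box filter: each pixel becomes the mean of its k x k neighborhood."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode='edge')
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return (out / (k * k)).astype(np.uint8)

def median_filter(img, k=3):
    """Median filter: particularly effective against salt & pepper noise."""
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    windows = np.stack([padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
                        for dy in range(k) for dx in range(k)])
    return np.median(windows, axis=0).astype(np.uint8)
```

A single bright outlier pixel survives (attenuated) under the average filter but is removed entirely by the median filter, which is why the median result looks cleanest on salt & pepper noise.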

III. Edge Detection

Identifying the boundaries or outlines of objects in the image by detecting changes in intensity.

   1. Sobel

sobel

   2. Roberts

roberts

   3. Prewitt

prewitt

   4. Canny

canny
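All four operators share the same structure: convolve with a pair of directional kernels and combine the responses into a gradient magnitude. A minimal Sobel sketch (the other kernels differ only in their coefficients):

```python
import numpy as np

def convolve2d(img, kernel):
    """Naive 'same'-size 2-D convolution with zero padding."""
    kh, kw = kernel.shape
    pad_y, pad_x = kh // 2, kw // 2
    padded = np.pad(img.astype(np.float64), ((pad_y, pad_y), (pad_x, pad_x)))
    flipped = kernel[::-1, ::-1]          # true convolution flips the kernel
    out = np.zeros(img.shape, dtype=np.float64)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * flipped)
    return out

def sobel_edges(img):
    """Gradient magnitude from the horizontal and vertical Sobel kernels."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    gx = convolve2d(img, kx)
    gy = convolve2d(img, ky)
    return np.hypot(gx, gy)
```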

IV. Histograms

The histogram of an image gives a graphical representation of the distribution of pixel intensities within that image. It plots the frequency of occurrence of each intensity value along the x-axis, with the corresponding number of pixels having that intensity value along the y-axis. This allows us to visualize how light or dark an image is overall.
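Computing the histogram and its cumulative curve (both shown in the GUI screenshots below) is straightforward; a minimal sketch:

```python
import numpy as np

def histogram(img, bins=256):
    """Count how many pixels fall into each intensity level."""
    hist = np.zeros(bins, dtype=np.int64)
    for v in img.ravel():
        hist[v] += 1
    return hist

def cumulative_curve(hist):
    """Running sum of the histogram (also the basis for equalization)."""
    return np.cumsum(hist)
```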

  • For a darker image

Observation: The histogram shows a higher concentration of pixels with lower intensity values, indicating a predominance of dark areas.

darker image

The entire GUI is displayed, containing a histogram, distribution curve, RGB histogram, and cumulative curve.

histogram

  • For a brighter image

Observation: We notice a more balanced distribution of intensity values, with a broader spread across the intensity axis. This indicates a wider range of brightness levels present in the image.

brighter

Again, the entire GUI is displayed, containing a histogram, distribution curve, RGB histogram, and cumulative curve.

brighter histo

  • Distribution Curve

The distribution curve represents the frequency distribution of pixel intensities across the entire image. Peaks in the distribution curve indicate dominant intensity levels, while valleys suggest less common intensity levels.

Observation: We noticed that the curve was skewed towards the higher end of the intensity axis, indicating that the image contains a significant number of bright pixels.

distribution

  • Histogram equalization

Histogram equalization is a technique used to improve the contrast in an image. It operates by effectively spreading out the most frequent intensity values, i.e., ‘stretching out’ the intensity range of the image. This method usually increases the global contrast of images when its usable data is represented by close contrast values.
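The classic grayscale version maps each intensity through the normalized cumulative distribution function (CDF). In the color-space variants below, only the luminance (Y) or value (V) channel is equalized; the sketch here covers the single-channel core:

```python
import numpy as np

def equalize_histogram(img):
    """Map intensities through the normalized CDF to flatten the histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf_min = cdf[cdf > 0][0]             # first non-zero CDF value
    n = img.size
    # classic equalization lookup table
    lut = np.clip(np.round((cdf - cdf_min) / (n - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[img]
```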

   1. With YCrCb color space

Observation: The image appears slightly desaturated, with increased contrast.

YCrCb color space

   2. With HSV color space

Observation: The image appears more vibrant and bright. The colors are more distinct and saturated, which might make certain details stand out more.

HSV color space

V. Normalization

Adjusting the pixel values of an image so that they fall within a specific range or distribution. Normalization helps reduce the effect of variations in illumination, enhances the contrast of images, and improves the performance of algorithms that operate on image data.

normalization
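The most common variant is min-max normalization, which linearly rescales intensities to a target range; a sketch (assuming that is the variant used here):

```python
import numpy as np

def normalize(img, new_min=0, new_max=255):
    """Min-max normalization: linearly rescale intensities to [new_min, new_max]."""
    img = img.astype(np.float64)
    old_min, old_max = img.min(), img.max()
    if old_max == old_min:                # flat image: nothing to stretch
        return np.full(img.shape, new_min, dtype=np.uint8)
    scaled = (img - old_min) / (old_max - old_min)
    return (scaled * (new_max - new_min) + new_min).astype(np.uint8)
```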

VI. Thresholding

   1. Global Thresholding

It is a simple and widely used technique in image processing for segmenting an image into two regions: foreground (object of interest) and background. The goal is to find a single threshold value that separates these two regions based on pixel intensity.

Global Thresholding Parameter:

  • Threshold value = 109.0

global
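Once the threshold value is chosen, applying it is a one-liner:

```python
import numpy as np

def global_threshold(img, t=109.0):
    """Binarize: pixels above t become foreground (255), the rest background (0)."""
    return np.where(img > t, 255, 0).astype(np.uint8)
```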

   2. Local Thresholding

It takes a more nuanced approach. Instead of using a single threshold for the entire image, it computes different thresholds for different regions based on the local characteristics of each pixel’s neighborhood.

Observation: Local thresholding tends to preserve fine details.

Local Thresholding Parameter:

  • Window Size = 40px

local
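A common way to realize this is to compare each pixel to the mean of its surrounding window. This sketch assumes a mean-based adaptive rule; the project's exact local statistic may differ:

```python
import numpy as np

def local_threshold(img, window=40, c=0):
    """Adaptive thresholding: compare each pixel to the mean of its window."""
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint8)
    half = window // 2
    for y in range(h):
        for x in range(w):
            # clamp the window to the image borders
            y0, y1 = max(0, y - half), min(h, y + half + 1)
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            if img[y, x] > img[y0:y1, x0:x1].mean() - c:
                out[y, x] = 255
    return out
```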

VII. Frequency Domain Filter

Finding the frequency components of an image through Fourier analysis. The frequency domain provides information about the spatial frequencies present in an image, such as low-frequency components representing smooth areas and high-frequency components representing edges or fine details.

   1. Low-pass Filter

low pass

   2. High-pass Filter

final high pass
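Both filters share the same pipeline: transform to the frequency domain, mask a disc of frequencies around the DC component, and transform back. The sketch below uses an ideal (hard-cutoff) mask for simplicity; the project may use a different filter shape:

```python
import numpy as np

def frequency_filter(img, cutoff=30, highpass=False):
    """Ideal low-pass or high-pass filtering via the 2-D FFT."""
    f = np.fft.fftshift(np.fft.fft2(img))          # DC moved to the center
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    mask = dist > cutoff if highpass else dist <= cutoff
    filtered = np.fft.ifft2(np.fft.ifftshift(f * mask))
    return np.abs(filtered)
```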

VIII. Hybrid Image

  • Low Frequency Image: Marilyn image
  • High Frequency Image: Einstein image

hybrid image
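A hybrid image combines the low frequencies of one image (Marilyn) with the high frequencies of another (Einstein). One common construction, sketched here with a Gaussian blur as the low-pass step (the project may instead reuse its FFT filters):

```python
import numpy as np

def gaussian_blur(img, sigma=5.0):
    """Separable Gaussian blur applied first along rows, then columns."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, 'same'), 1, img.astype(float))
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, 'same'), 0, blurred)

def hybrid_image(low_img, high_img, sigma=5.0):
    """Low frequencies of one image + high frequencies (img - blur) of the other."""
    low = gaussian_blur(low_img, sigma)
    high = high_img.astype(float) - gaussian_blur(high_img, sigma)
    return np.clip(np.round(low + high), 0, 255).astype(np.uint8)
```

Viewed up close, the high-frequency image dominates; from a distance, only the low-frequency image remains visible.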


Boundary Detection

Description

Boundary detection is the task of identifying the edges and boundaries within an image. It is commonly used in image processing to delineate objects or regions within a scene.

I. Edge Detection

Identifying edges within the images.

Case 1

Parameters:

  • Sigma = 1

sigma1

  • Sigma = 50

sigma50 again

Observation

For a higher sigma value, the image becomes more blurred (smoother) and more details are lost.

Case 2

Parameters:

  • T_low = 0.05
  • T_high = 0.09

lenda1

  • T_low = 0.1
  • T_high = 0.3

lenda2

Observation

  • For lower values of T_low and T_high: weak edges appear clearly.
  • For higher values of T_low and T_high: only the strongest edges appear, and most of the detail is lost.

II. Shape Detection

   1. Line Detection

Case 1

Parameters:

  • Number of Lines = 4
  • Neighbor Size = 10

Observation: The line detection algorithm was applied with a small number of peaks = 4, resulting in a reduced number of detected lines compared to the original lines in the image.

num of lines4

  • Number of Lines = 8
  • Neighbor Size = 10

Observation: The line detection algorithm was applied with a moderate number of peaks = 8, resulting in a balanced representation of lines in the image.

8lines

  • Number of Lines = 27
  • Neighbor Size = 10

Observation: The line detection algorithm was applied with a large number of peaks = 27, resulting in the detection of one line being represented as multiple lines, leading to a dense cluster of lines plotted over it.

27lines

Case 2

Parameters:

  • Neighbor Size = 1
  • Number of Lines = 7

Observation: Using a small neighborhood size (1) results in detected lines that closely align with the original lines in the image.

1neighbor

  • Neighbor Size = 35
  • Number of Lines = 7

Observation: Using a large neighborhood size (35) results in a more generalized representation of the detected lines compared to the original lines in the image.

35neigh

   2. Circle Detection

Case 1: Narrow Range

  • Min Radius = 7
  • Max Radius = 50
  • Bin Threshold = 0.6
  • Pixel Threshold = 20

circle1

Case 2: Wide Range

  • Min Radius = 64
  • Max Radius = 160
  • Bin Threshold = 0.6
  • Pixel Threshold = 20

wide circle

  • Min Radius = 17
  • Max Radius = 100
  • Bin Threshold = 0.6
  • Pixel Threshold = 20

circle case2

Case 3: Bin Threshold

  • Min Radius = 7
  • Max Radius = 81
  • Bin Threshold = 1.0
  • Pixel Threshold = 100

case3

   3. Active Contour Model (Snakes)

It is a powerful technique used in image processing for tasks such as object tracking and segmentation. The model works by iteratively adjusting a contour to fit the edges in an image, based on an energy minimization process.

Chain code

The chain code is a compact representation used in image processing and computer vision to describe the shape of a contour. It simplifies the contour by encoding the direction of transitions between adjacent points along the contour.

The perimeter of the contour

It refers to the total length of its boundary. It represents the distance around the shape.

The area of the contour

It represents the total surface area enclosed by the contour.
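The three measurements reported below each snake result can be computed from the final contour points. A sketch, assuming image coordinates (y increasing downward) for the 8-direction Freeman code; the project's direction convention may differ:

```python
import numpy as np

# 8-connected steps indexed by Freeman chain code (0 = east, counter-clockwise,
# with y increasing downward as in image coordinates)
DIRS8 = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)]

def chain_code8(points):
    """Freeman 8-direction chain code between consecutive contour points."""
    code = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (np.sign(x1 - x0), np.sign(y1 - y0))
        code.append(DIRS8.index(step))
    return code

def perimeter(points):
    """Sum of Euclidean distances around the closed contour."""
    pts = np.asarray(points, dtype=float)
    return float(np.sum(np.linalg.norm(np.roll(pts, -1, axis=0) - pts, axis=1)))

def area(points):
    """Shoelace formula for the area enclosed by the contour."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    return float(abs(np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)) / 2)
```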

Parameters

  • Square Contour
  • Alpha = 3
  • Beta = 96
  • Gamma = 100
  • Iterations = 95

snake1

Chain code 8: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 7, 7, 7, 7, 7, 0, 7, 0, 7, 0, 1, 7, 0, 0, 7, 0, 0, 0, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 5, 4, 5, 4, 4, 5, 4, 4, 5, 4, 3, 5, 4, 4, 5, 4, 4, 4, 4, 3, 7, 0, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

Chain code 4: [0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 2, 1, 1, 1, 1, 1]

Contour perimeter: 625.86

Contour area: 28925.50 square units


  • Square Contour
  • Alpha = 3
  • Beta = 96
  • Gamma = 92
  • Iterations = 95

snake2

Chain code 8: [2, 1, 5, 1, 1, 1, 1, 1, 1, 7, 7, 7, 0, 1, 1, 1, 7, 7, 7, 7, 7, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 3, 2, 3, 7, 7, 7, 6, 6, 6, 6, 6, 6, 6, 6, 7, 5, 6, 6, 6, 7, 6, 6, 5, 6, 6, 5, 5, 5, 5, 5, 4, 4, 4, 0, 4, 0, 4, 4, 4, 4, 4, 0, 4, 4, 0, 0, 4, 4, 0, 0, 4, 4, 0, 0, 4, 4, 4, 4, 4, 0, 0, 4, 4, 0, 0, 4, 4, 0, 0, 4, 4, 0, 0, 4, 4, 0, 0, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 3, 6, 1, 2, 1, 1, 1, 2, 1, 3, 2, 1, 2, 1, 2, 1, 1, 1]

Chain code 4: [1, 0, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 0, 2, 0, 2, 2, 2, 2, 2, 0, 2, 2, 0, 0, 2, 2, 0, 0, 2, 2, 0, 0, 2, 2, 2, 2, 2, 0, 0, 2, 2, 0, 0, 2, 2, 0, 0, 2, 2, 0, 0, 2, 2, 0, 0, 2, 2, 2, 2, 2, 1, 1, 1, 3, 1, 1, 1, 1, 1]

Contour perimeter: 665.76

Contour area: 22442.50 square units


  • Circle Contour
  • Alpha = 3
  • Beta = 96
  • Gamma = 92
  • Iterations = 95

snake3

Chain code 8: [6, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 6, 2]

Chain code 4: [3, 2, 2, 2, 1, 1, 0, 3, 3, 1]

Contour perimeter: 663.46

Contour area: 33908.00 square units


Features Detection and Image Matching

Feature detection involves identifying key points in images that are used to match similar images or track objects.

I. Feature Extraction

   1. Harris Operator

Parameters:

  • Very Low Threshold = 0.1

Observation:

The algorithm detects a large number of corners, resulting in an over-detection scenario.

0 1threshold

  • High Threshold = 0.8

Observation:

Applying a high threshold value of 0.8 to the Harris response significantly reduces the number of detected corners to 103, with only a sparse set identified in regions with pronounced intensity variations.

0 8threshold

  • Medium Threshold = 0.4

Observation:

A balanced distribution of corners is identified, covering both prominent features and subtle intensity variations.

medthreshold
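The thresholds above are applied relative to the maximum Harris response. A compact sketch of the detector, using simple central-difference gradients and a box window for the structure tensor (a Gaussian window is also common):

```python
import numpy as np

def harris_response(img, k=0.04, threshold=0.4):
    """Harris corner response R = det(M) - k * trace(M)^2, with corners
    taken where R exceeds `threshold` times the maximum response."""
    img = img.astype(np.float64)
    # image gradients via central differences
    iy, ix = np.gradient(img)
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy
    # 3x3 box window sums for the structure tensor M
    def box(a):
        p = np.pad(a, 1, mode='edge')
        return sum(p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
                   for dy in range(3) for dx in range(3))
    sxx, syy, sxy = box(ixx), box(iyy), box(ixy)
    r = (sxx * syy - sxy ** 2) - k * (sxx + syy) ** 2
    corners = np.argwhere(r > threshold * r.max())
    return r, corners
```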

   2. Lambda Minus

Parameters:

  • Threshold = 0.04
  • block_size = 2
  • k_size = 3

lambda1

Computation time: 0.0040 s

lambda2 Computation time: 0.0315 s

  • Threshold = 0.04
  • block_size = 2
  • k_size = 5

lambda3 Computation time: 0.0050 s

  • Threshold = 0.04
  • block_size = 2
  • k_size = 7

lambda5 Computation time: 0.0040 s


  • Threshold = 0.04
  • block_size = 3
  • k_size = 7

lambda6 Computation time: 0.0031 s

Observation:

  • If you increase block_size, the neighborhood window considered for corner detection becomes larger.
  • If you increase k_size, the Sobel operator becomes more sensitive to larger, more prominent edges and less sensitive to smaller, finer details. This can result in fewer corners being detected, but those detected tend to be more robust and significant.

II. Feature Descriptors Generation

  • The Scale-Invariant Feature Transform (SIFT)

It is a powerful method for detecting and describing local features in images. It is widely used in computer vision tasks such as object recognition, image stitching, and 3D reconstruction. SIFT features are invariant to image scale and rotation, and partially invariant to changes in illumination and viewpoint.

f

Observation:

The images above show the detected keypoints with their orientations.

Computation time: 129.29 sec

III. Feature Matching

   1. Using KNN

maching using KNN

   2. Using NCC

NCC

   3. Using SSD

SSD

   4. Detecting Objects Using NCC

detecting objects using NCC
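SSD and NCC score a pair of descriptors directly; matching then just pairs each descriptor with its best-scoring counterpart. A minimal greedy sketch (real matchers usually add a ratio test or cross-checking):

```python
import numpy as np

def ssd(d1, d2):
    """Sum of squared differences: lower = better match."""
    return float(np.sum((d1 - d2) ** 2))

def ncc(d1, d2):
    """Normalized cross-correlation: closer to 1 = better match."""
    a = d1 - d1.mean()
    b = d2 - d2.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_features(desc1, desc2, method=ssd):
    """Greedy matching: for each descriptor in desc1, pick the best in desc2."""
    matches = []
    for i, d1 in enumerate(desc1):
        scores = [method(d1, d2) for d2 in desc2]
        j = int(np.argmin(scores)) if method is ssd else int(np.argmax(scores))
        matches.append((i, j))
    return matches
```

Because NCC normalizes out mean and scale, it is the better choice for template-style object detection under illumination changes, as in the last screenshot.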


Image Segmentation

Image segmentation is the process of partitioning an image into multiple segments to simplify its analysis. It is commonly used to isolate objects or regions of interest in an image.

I. Optimal Thresholding

   1. Global

global

   2. Local

Parameters:

  • block_size = 20

Observation:

In regions with intricate details or sharp intensity transitions, smaller blocks capture nuances more effectively.

image

  • block_size = 50

Observation:

With a moderate block size, a balance between local analysis and computational efficiency is achieved.

The segmentation results tend to be smoother compared to smaller block sizes due to averaging effects over larger regions.

image

  • block_size = 80

Observation:

Larger block sizes lead to more global analysis of pixel intensities, encompassing broader regions within the image.

image

II. Otsu Thresholding

   1. Global

image
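Otsu's method chooses the threshold that maximizes the between-class variance of the foreground/background split. A histogram-based sketch of the global variant (the local variant applies the same routine per block):

```python
import numpy as np

def otsu_threshold(img):
    """Return the intensity threshold maximizing between-class variance."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, 0.0
    w_b = sum_b = 0.0
    for t in range(256):
        w_b += hist[t]                    # background weight
        if w_b == 0:
            continue
        w_f = total - w_b                 # foreground weight
        if w_f == 0:
            break
        sum_b += t * hist[t]
        mu_b = sum_b / w_b
        mu_f = (sum_all - sum_b) / w_f
        var_between = w_b * w_f * (mu_b - mu_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```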

   2. Local

Parameters

  • block_size = 50

image

  • block_size = 80

image

III. Spectral Thresholding

   1. Global

Parameters:

  • number of classes = 2

image

  • number of classes = 3

image

  • Increasing the number of classes further

image

Conclusion:

With each increment in the number of classes, the algorithm will compute an additional threshold value. These thresholds aim to segment the image into more distinct intensity levels or regions.

Because the image histogram has roughly three peaks, choosing 3 classes allows for finer segmentation.

Increasing the number of classes further may waste computational resources and produce noisy segmentation results. It is generally advisable to choose the number of classes based on the peaks present in the image histogram.

   2. Local

Parameters:

  • Smaller Window Size = 25

image

  • Larger Window Size = 70

image

IV. K-means Clustering

Parameters:

  • # Clusters = 2
  • # Iterations = 15

image

  • # Clusters = 5
  • # Iterations = 15

image
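The two parameters map directly onto Lloyd's algorithm: alternate between assigning pixels to the nearest centroid and recomputing centroids, for a fixed number of iterations. A grayscale sketch (the project clusters in color space, but the structure is the same; the seed parameter here is ours, for reproducibility):

```python
import numpy as np

def kmeans_segment(img, k=2, iterations=15, seed=0):
    """K-means on pixel intensities: each pixel is replaced by the
    centroid of the cluster it ends up in."""
    rng = np.random.default_rng(seed)
    pixels = img.reshape(-1, 1).astype(np.float64)
    # initialize centroids from randomly chosen pixels
    centroids = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(iterations):
        dists = np.abs(pixels - centroids.T)      # (n_pixels, k)
        labels = np.argmin(dists, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = pixels[labels == c].mean()
    return centroids[labels].reshape(img.shape).astype(np.uint8)
```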

V. Region Growing Segmentation

  • Threshold = 4

image

  • Threshold = 50

image

Observation:

As the threshold increases, the similarity criterion is relaxed, so the segmented region grows progressively larger, incorporating more neighboring pixels.
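Region growing is a breadth-first flood fill from a seed pixel, admitting neighbors whose intensity is close enough to the seed. A 4-connected sketch (the project may compare against the evolving region mean rather than the seed value):

```python
import numpy as np
from collections import deque

def region_grow(img, seed, threshold=4):
    """Grow a region from `seed`, adding 4-connected neighbors whose
    intensity differs from the seed by at most `threshold`."""
    h, w = img.shape
    seed_val = float(img[seed])
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not region[ny, nx] \
                    and abs(float(img[ny, nx]) - seed_val) <= threshold:
                region[ny, nx] = True
                queue.append((ny, nx))
    return region
```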

VI. Agglomerative Clustering

  • Clusters = 10
  • Threshold = 15

image

  • Clusters = 5
  • Threshold = 15

image

Observation:

Increasing the number of clusters produces more distinct color groups in the output image, potentially capturing finer detail but also increasing complexity.

  • Clusters = 10
  • Threshold = 50

image

  • Clusters = 10
  • Threshold = 15

image

Observation:

A higher initial number of clusters leads to finer initial clustering, potentially capturing more subtle color variations in the image. However, it may also increase the computational complexity of the algorithm.

VII. Mean Shift Clustering

  • Bandwidth = 60

image

  • Bandwidth = 180

image

  • Bandwidth = 350

image

VIII. RGB to LUV Conversion

image

Face Detection and Recognition

Description

Face detection involves identifying human faces in digital images. Face recognition goes a step further by identifying or verifying individuals based on facial features.