
The goal is to find the best algorithm for content-based image retrieval.


Content-based Image Retrieval

Table of contents

  1. Description
  2. Replication
  3. Result
  4. License

Description

Content-based image retrieval (CBIR) is the application of computer vision techniques to the image retrieval problem, which involves searching for digital images in large databases.

Content-based means that the search analyzes the contents of the image rather than metadata such as keywords, tags, or descriptions associated with the image.


The process consists of four steps:

  1. Extraction of features from an image database to form a feature database.
  2. Extraction of features from the input image.
  3. Finding the most similar features in the database.
  4. Returning the image associated with the found features.
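The four steps above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `extract_features` function is a hypothetical stand-in (a normalised per-channel histogram) for whichever extraction model is used, cosine similarity is assumed as the comparison, and the images are random arrays.

```python
import numpy as np


def extract_features(image: np.ndarray, bins: int = 8) -> np.ndarray:
    """Hypothetical extractor: L2-normalised per-channel intensity histogram."""
    hist = [
        np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
        for c in range(image.shape[-1])
    ]
    vec = np.concatenate(hist).astype(np.float64)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def build_feature_database(images) -> np.ndarray:
    """Step 1: extract features from every database image."""
    return np.stack([extract_features(img) for img in images])


def retrieve(query_image: np.ndarray, feature_db: np.ndarray, top_k: int = 1) -> np.ndarray:
    """Steps 2-4: featurise the query, score it against the database,
    and return the indices of the top_k most similar images."""
    query = extract_features(query_image)
    # Vectors are unit length, so the dot product is the cosine similarity.
    scores = feature_db @ query
    return np.argsort(-scores)[:top_k]


# Toy database of random RGB images (assumption: real data would be loaded from disk).
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8) for _ in range(5)]
feature_db = build_feature_database(images)

# Querying with a database image should return that same image first.
best = retrieve(images[2], feature_db, top_k=1)
```

Keeping the feature database as a single array means the search in step 3 reduces to one matrix-vector product, which is usually the cheapest part of the pipeline.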

Purpose

The goal is to determine the most suitable feature-extraction model and similarity measure for finding similar images. To this end, we compare three similarity measures and five feature-extraction models.

Similarity Measurements

Feature Extraction

The objective is to find the right combination (extraction algorithm & similarity measure) that allows us to obtain relevant answers.
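As a rough sketch of how one feature set can be ranked under several measures, the snippet below compares three common distances. The choice of cosine, Euclidean, and Manhattan distance is an assumption made for illustration; this README does not name the three measures actually evaluated, and the feature vectors are toy values.

```python
import numpy as np


def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.linalg.norm(a - b))


def manhattan_distance(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.sum(np.abs(a - b)))


# Assumed set of measures; swap in the ones under evaluation.
MEASURES = {
    "cosine": cosine_distance,
    "euclidean": euclidean_distance,
    "manhattan": manhattan_distance,
}


def rank(query: np.ndarray, feature_db: np.ndarray, measure) -> np.ndarray:
    """Return database indices ordered from most to least similar."""
    distances = [measure(query, row) for row in feature_db]
    return np.argsort(distances)


# Toy 2-D feature vectors standing in for real extracted features.
feature_db = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
query = np.array([1.0, 0.05])

rankings = {name: rank(query, feature_db, fn).tolist() for name, fn in MEASURES.items()}
```

On this toy data all three measures agree on the ordering; on real features they can disagree, which is why each model is paired with each measure.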

Evaluation

For our experiments, we used the Apparel fashion dataset available on Kaggle. To evaluate each combination (model + similarity measure), we used three metrics:

  • Mean Average Precision (MAP) for the system's robustness.
  • Mean Reciprocal Rank (MRR) for how highly the first relevant result is ranked.
  • Average time per query.

The formulas for these metrics are taken from this Stanford course.
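For reference, the three metrics can be computed along these lines. This is a sketch under stated assumptions, not the project's evaluation code: each query is represented as a list of booleans marking whether the result at each rank is relevant, average precision is normalised by the number of relevant results retrieved (other variants divide by the total number of relevant items), and `average_query_time` is a hypothetical helper.

```python
import time


def average_precision(relevance) -> float:
    """Mean of precision@k over the ranks k that hold a relevant result."""
    hits = 0
    precisions = []
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0


def reciprocal_rank(relevance) -> float:
    """1 / rank of the first relevant result, or 0.0 if there is none."""
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / k
    return 0.0


def mean_average_precision(runs) -> float:
    return sum(average_precision(r) for r in runs) / len(runs)


def mean_reciprocal_rank(runs) -> float:
    return sum(reciprocal_rank(r) for r in runs) / len(runs)


def average_query_time(search, queries) -> float:
    """Hypothetical helper: mean wall-clock seconds per call to search(query)."""
    start = time.perf_counter()
    for query in queries:
        search(query)
    return (time.perf_counter() - start) / len(queries)


# Two toy queries: relevance of the results at ranks 1, 2, 3.
runs = [[True, False, True], [False, True, False]]
map_score = mean_average_precision(runs)  # (5/6 + 1/2) / 2
mrr_score = mean_reciprocal_rank(runs)    # (1 + 1/2) / 2
```

MAP rewards placing all relevant results near the top, while MRR only looks at the first relevant one, so the two can rank combinations differently.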

Replication

All the experiments can be reproduced using the Makefile:

  • Create the necessary virtual environment for all tests:
make venv
  • Create a directory for the features and a .env file containing the required paths:
make prepare
  • Create the feature dataset:
make features

Result

Experimental results are presented in the report folder. See the PDF file and graphs there to compare the performance of the different combinations.

License

This project is licensed under the MIT License.