/multimedia-and-web-databases

This project implements advanced image retrieval techniques for the Caltech101 dataset, using algorithms such as MDS, SVD, DBSCAN, and Locality Sensitive Hashing (LSH). It focuses on improving retrieval accuracy and efficiency through dimensionality reduction, clustering, classification, and relevance feedback.

Primary language: Jupyter Notebook

Multimedia and Web Databases Project - Phase 3

About the project

Phase 3 of this project experiments with clustering, indexing, classification, and relevance feedback models. The tasks in this phase build on the feature models, similarity/distance functions, and latent space extraction algorithms developed in the previous phase.

Code structure

This repository contains three top-level Python files, the internal packages listed below, and three folders for Data, Input and Outputs:

  • util - contains all constants and utility methods used across the application
  • storage - used to generate all feature descriptors and to find the k most similar images; all input is passed to this file through its functions
  • cli - entry point for the application. Runs a CLI with an argument subparser for each task. cli.toml contains the static data required by the CLI.
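The "k most similar images" lookup mentioned for storage can be illustrated with a minimal sketch. It assumes feature descriptors are plain lists of floats keyed by image ID and uses Euclidean distance; the function names `euclidean_distance` and `k_most_similar` are hypothetical and are not taken from this repository, which may use a different distance measure.

```python
import heapq
import math


def euclidean_distance(a, b):
    """Return the Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_most_similar(query, descriptors, k):
    """Return the k (image_id, distance) pairs closest to the query vector.

    `descriptors` maps an image ID to its feature vector (an assumed layout,
    chosen only for this illustration).
    """
    scored = (
        (image_id, euclidean_distance(query, vector))
        for image_id, vector in descriptors.items()
    )
    # nsmallest avoids sorting the full collection when k is small.
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Toy descriptors for three images.
descriptors = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [3.0, 4.0]}
top2 = k_most_similar([0.0, 0.0], descriptors, k=2)
```

Smaller distances mean more similar images, so the first entry of the result is the best match.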

Packages:

  • FeatureDescriptor/
    • create_label_vectors and create_label_feature_vectors are the scripts that generate all feature descriptors required by this application
    • color_moments, HOG, resnet_features and resnet_softmax_features contain the methods that generate the respective feature descriptors for a given image ID or an external image. These methods are used both in the application and in the scripts
    • similarity_measures contains the methods for the different similarity measures used in this project
  • DimensionalityReductionTechniques/
    • dimensionality_reduction_techniques - contains a method that performs the dimensionality reduction selected by the input drt and returns the corresponding DRT class.
    • the module in this package contains the classes that perform the different dimensionality reduction techniques: SVD, NNMF, LDA and K-Means.
  • Classifiers/
    • Classifiers contains the implementations of all classifiers needed in this project: MNN, PPR, Decision Tree, SVM, and multiclass SVM
    • the clustering folder contains the DBSCAN implementation.
  • Tasks/
    • task_runner: used in cli.py to call the appropriate task executor based on the CLI input
    • task_[0-5]: each of these modules executes the task with the corresponding ID
    • task_util: contains the utility methods used across the tasks
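As an illustration of what a color moments feature descriptor computes, the sketch below derives the mean, standard deviation, and skewness of each color channel and concatenates them into one vector. It is a simplified example under stated assumptions, not the code in FeatureDescriptor/: each channel is assumed to be a flat list of pixel values, and the function names are hypothetical.

```python
import math


def color_moments(values):
    """Return (mean, standard deviation, skewness) for one channel's pixels."""
    n = len(values)
    if n == 0:
        raise ValueError("channel must contain at least one pixel")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    std = math.sqrt(variance)
    # Signed cube root of the third central moment, a common convention
    # for color-moment descriptors (assumed here, not confirmed by the repo).
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, std, skew


def color_moments_descriptor(channels):
    """Concatenate the three moments of every channel into one flat vector."""
    vector = []
    for channel in channels:
        vector.extend(color_moments(channel))
    return vector


# Toy "image": three channels with four pixels each.
toy_channels = [[1, 1, 1, 1], [0, 0, 0, 4], [2, 4, 6, 8]]
toy_descriptor = color_moments_descriptor(toy_channels)
```

Descriptors of this kind are often computed per grid cell of the image rather than over the whole image; that step is left out to keep the example short.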

Output

After the program runs, all extracted images are saved in the Data folder, and all feature descriptors of images and labels, latent semantics, and task outputs are saved in the Outputs folder. The Data and Outputs folders are not included in the repository because they are generated automatically when the program runs. The Outputs folder containing all generated feature descriptor (FD) csv/txt files has been moved out of this repository; query output screenshots are also stored in that outer Outputs folder.

Hardware/Software Requirements

  • Processor: Minimum 2 GHz
  • Hard Drive: Minimum 3 GB
  • RAM: Minimum 4 GB
  • OS: Windows/Mac OSX
  • Programming language: Python 3.9+

How to run

  • Install all dependencies - pip install -r requirements.txt

  • Pre-process the image data before running the CLI application. Run the create_label_vectors and create_label_feature_vectors scripts to generate all the required feature descriptors in Outputs

  • Use python cli.py --help to list the sub-commands for each task

  • Run python cli.py --i or make run_interactive to start the application in interactive mode

  • Alternatively, run a task directly without interactive mode - python cli.py task{task_id} [arguments]
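The command shape above, one sub-command per task plus an interactive flag, can be sketched with argparse subparsers. This is an assumption-laden illustration and not the repository's cli.py: the `--k` argument and the help texts are hypothetical, and the real per-task arguments should be checked with python cli.py --help.

```python
import argparse


def build_parser():
    """Build a parser with one sub-command per task (task0 .. task5)."""
    parser = argparse.ArgumentParser(prog="cli.py")
    parser.add_argument(
        "--i",
        dest="interactive",
        action="store_true",
        help="run in interactive mode",
    )
    subparsers = parser.add_subparsers(dest="task")
    for task_id in range(6):
        task_parser = subparsers.add_parser(
            f"task{task_id}", help=f"run task {task_id}"
        )
        # Hypothetical argument, used only to show per-task options.
        task_parser.add_argument("--k", type=int, default=5)
    return parser


parser = build_parser()
task_args = parser.parse_args(["task1", "--k", "3"])
interactive_args = parser.parse_args(["--i"])
```

Here `task_args.task` holds the selected sub-command name, which a dispatcher such as task_runner could use to pick the matching task executor.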