ImageRetrievalSystem: A Python repository from ngctnnnn

Multiple techniques to enhance performance in CNN-based Image Retrieval systems

Abstract

Activations of Convolutional Neural Networks (CNNs) used as image descriptors have become the leading approach in image retrieval owing to their efficiency and compact representation. However, they require a large amount of annotated data, and high-quality annotation is essential to achieve reasonable results. In this work, we fine-tune CNNs for image retrieval on a collection of unordered images in a fully automated manner. The selection of the training data is guided by 3D models reconstructed with state-of-the-art retrieval and Structure-from-Motion (SfM) methods. We additionally apply a novel trainable Generalized-Mean (GeM) pooling layer, which generalizes max and average pooling, to boost retrieval performance. We conduct our experiments with VGG and ResNet architectures on the Oxford5k, Paris6k, ROxford5k and RParis6k benchmarks.

Keywords: Image Retrieval, Convolutional Neural Networks, Deep Learning

Report

🎃🎃 Our full report is shown here
🎃🎃 Our demo video is here

This project is hosted by:

Full name	Role
Tan Ngoc Pham	Leader
An Vo	Member
Dzung Tri Bui	Member

Introduction
Repo structure
Demo
Experimental configuration
Results
References

1. Introduction

In this work, we adopt unsupervised fine-tuning of CNNs for image retrieval. First, we exploit SfM information to select both hard matching and hard non-matching examples for CNN training. Second, we let our architectures learn the whitening from the same training data, to overcome the limitations of traditional whitening on short representations. Finally, we use a trainable pooling layer that generalizes existing popular pooling schemes for CNNs, which improves performance while preserving the descriptor dimensionality.
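To make the pooling step concrete, here is a minimal pure-Python sketch of generalized-mean pooling. It is an illustration only, not the code used in this repository: the function names `gem_pool` and `gem_descriptor`, the default `p=3.0`, and the `eps` clamp are assumptions made for the example, and in the trainable layer `p` would be learned rather than fixed.

```python
import math


def gem_pool(activations, p=3.0, eps=1e-6):
    """Generalized-mean pool one feature map given as a flat list of activations.

    p = 1 gives average pooling; as p grows the result approaches max pooling.
    """
    # Clamp to a small positive value so fractional powers stay well defined.
    clamped = [max(a, eps) for a in activations]
    mean_of_powers = sum(a ** p for a in clamped) / len(clamped)
    return mean_of_powers ** (1.0 / p)


def gem_descriptor(feature_maps, p=3.0):
    """Pool every channel to one value, then L2-normalize the result.

    The descriptor has one entry per channel, so its dimensionality does not
    depend on p or on the spatial size of the feature maps.
    """
    pooled = [gem_pool(channel, p) for channel in feature_maps]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled]
```

With `p=1` the call `gem_pool([1.0, 2.0, 3.0], p=1)` reduces to the arithmetic mean, while a large exponent such as `p=100` returns a value close to the maximum activation.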

2. Repo structure

src: All of our source code
- public
  - css
  - img: assets of our work
  - script/cnnimageretrieval-pytorch: the Python core that handles the models and the retrieval system behind our demo
- resources
  - scss
  - views: Frontend code
- routes
  - index.js: JavaScript core that handles the logic behind the frontend
- index.js: main JS file that routes the demo
- package.json
notebook: Logged results from running our work
.gitattributes
.gitignore
LICENSE
Procfile
deploy.sh
package.json
requirements.txt
yarn.lock
report.pdf: Our final report on this work

3. Demo

Processing a query, including cropping the uploaded image, takes 18 seconds on average.

Run demo

Install yarn
Install dependencies:

pip install -r requirements.txt

Run project:

yarn 
yarn start

Reproduce our final results

cd src/public/script/cnnimageretrieval-pytorch

python3 -m cirtorch.examples.test \
          --gpu-id '0' \
          --network-path 'retrievalSfM120k-resnet101-gem' \
          --datasets 'oxford5k' \
          --whitening 'retrieval-SfM-120k' \
          --multiscale '[1, 1/2**(1/2), 1/2]'
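
The value passed to `--multiscale` in the command above is written as a Python list expression. The short sketch below shows what those three scales evaluate to and how they would shrink an image. The helper `scaled_sizes` and the 1024-pixel example are hypothetical, chosen only for illustration; how the test script resizes images and combines the per-scale descriptors is not shown here.

```python
# The same three scales as in the --multiscale string: full size,
# 1/sqrt(2) of the size, and half size.
scales = [1, 1 / 2 ** (1 / 2), 1 / 2]


def scaled_sizes(longest_side, scales):
    """Return the longest image side in pixels at each scale, rounded."""
    return [round(longest_side * s) for s in scales]


# Example with a hypothetical image whose longest side is 1024 pixels.
sizes = scaled_sizes(1024, scales)
```

Each successive scale halves the image area, so the descriptor is extracted from the image at three resolutions.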

Screenshot from our demo

4. Experimental configuration

We fine-tune starting from pre-trained ResNet101-GeM and VGG16-GeM models. We conduct our experiments with the PyTorch framework on an NVIDIA RTX 3060 GPU, 16GB RAM, and an 11th Gen Intel® Core™ i7-11700K @ 3.60GHz×16 CPU.

5. Results

Our work is inspired by: CNN Image Retrieval in PyTorch: Training and evaluating CNNs for Image Retrieval in PyTorch
