CIRCE

Overview

This is the code related to the paper CIRCE: Real-Time Caching for Instance Recognition in Cloud Environments and Multi-Core Architectures. CIRCE performs Instance Recognition tasks in 14 ms on a 16-core architecture, with a hit ratio of at least 60%, a precision of at least 97%, and a speedup of about 10x with respect to CloudAR.

What is CIRCE?

CIRCE is a Similarity Cache (SC) system for Instance Recognition (IR) tasks performed on a Back End System (BES) and submitted by mobile devices.

Consider an application where a mobile device takes a picture of an object instance (e.g., a movie poster at the movie theater). The picture is sent to the BES (since performing the IR task on the device could be too expensive in time/memory), which identifies the object instance in the input image and returns context about it (e.g., the movie title and its IMDB rating).

CIRCE tries to speed up similar and frequent IR tasks by caching the input images sent by users together with the corresponding instance labels generated by the BES: if an input image is similar enough to one of the cached images, we have a cache hit and we return the label associated with the hit image, without querying the BES. To make everything even faster, CIRCE exploits parallel techniques for computing image descriptors and image-code similarities during cache look-ups.
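As a rough illustration of the hit/miss logic described above, here is a minimal sketch in C++ (hypothetical names, a plain cosine-similarity test, and a sequential scan; CIRCE's actual data structures, similarity measure and parallel look-up are more involved):

#include <cmath>
#include <string>
#include <vector>

// One cached entry: an image code (e.g., a VLAD vector) and the instance
// label the BES returned when the image was first seen.
struct CacheEntry {
    std::vector<float> code;
    std::string label;
};

// Cosine similarity between two image codes of the same dimension.
static float cosineSimilarity(const std::vector<float>& a, const std::vector<float>& b) {
    float dot = 0.f, na = 0.f, nb = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb) + 1e-12f);
}

// Returns true on a cache hit and fills labelOut with the cached label.
bool lookup(const std::vector<CacheEntry>& cache, const std::vector<float>& query,
            float threshold, std::string& labelOut) {
    float best = -1.f;
    const CacheEntry* bestEntry = nullptr;
    for (const CacheEntry& e : cache) {   // in CIRCE this scan is parallelized
        float s = cosineSimilarity(query, e.code);
        if (s > best) { best = s; bestEntry = &e; }
    }
    if (bestEntry != nullptr && best >= threshold) {
        labelOut = bestEntry->label;      // hit: reuse the cached label
        return true;
    }
    return false;                         // miss: fall back to the BES
}

On a miss, the caller queries the BES and can insert the new (code, label) pair into the cache, subject to the cache replacement policy.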

User Manual

Requirements

Conventions

  • $CCROOT: path to CIRCE.
  • $VLFEATROOT: path to local VLFeat copy.
  • $DATASET: image dataset (see below).
  • $GTFILES: ground truth folder (see below).
  • $OUTPUT: path to CIRCE output (binaries of SIFT descriptors, VLAD/Fisher Vector codes, ...).

Datasets

The datasets currently available for CIRCE are:

  1. Oxford Building Dataset (MAP only). You will need the Paris Building Dataset for training.
  2. Poster Movie dataset.
  3. Painting dataset.
  4. Revisited Oxford Building dataset.

Note: contact the author to obtain a copy of the last three datasets.

Compiling

CIRCE is compiled with the Intel C++ Compiler for best performance. Remember to source the compiler environment variables by running the following in your terminal (or adding it to your .bashrc file):

source /opt/intel/compilers_and_libraries_2017.3.191/linux/bin/compilervars.sh intel64 >/dev/null

  1. Variables in $CCROOT/make/makefile need to be changed according to your configuration. In particular:
    1. LOCAL : OpenCV root directory (e.g. /usr/local/)
    2. VLFEAT : $VLFEATROOT
    3. VLFEATLIB: $VLFEATROOT/bin/$ARCH where $ARCH depends on your architecture (e.g. glnxa64). See here for more details.
    4. If you want to disable LCS, RN or SSR for VLAD, remove the corresponding macro definitions in the makefile.
  2. Edit $CCROOT/main.cpp and set the paths of dataDir, trainDir and gtDir for all the datasets that you want to evaluate.
  3. Compile with make in $CCROOT/make.
  4. Remember to add the OpenCV and VLFeat libraries to your LD_LIBRARY_PATH.

Note: the dir variables in main.cpp are prefixed by std::string homeDir, which is automatically set to your home directory. This may not work on non-Linux systems (together with a couple of other lines in the project). You can simply delete the homeDir variable in main.cpp and choose your own paths for these three dirs.

Binaries

Creating descriptors/codes is time-consuming, so CIRCE will look in $OUTPUT for the corresponding binaries: Codes.bin contains the VLAD codes for the given dataset, DescData.bin the dataset descriptors, DescTrain.bin the training descriptors, and Trainer.bin the k-means values.

IMPORTANT: if you change the parameters of an algorithm (e.g., nOctaveLayers for SIFT or numCenters for VLAD), you must delete the corresponding .bin file, otherwise CIRCE will keep reading the stale one. Remember this!
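The look-in-$OUTPUT-first behaviour boils down to a load-or-compute pattern. A minimal C++ sketch (hypothetical helper and file layout; CIRCE's actual binary format may differ):

#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Placeholder for the expensive step (descriptor extraction + encoding).
static std::vector<float> computeCodes() { return std::vector<float>(16 * 128, 0.f); }

// Load Codes.bin if it exists, otherwise compute the codes and cache them on disk.
std::vector<float> loadOrComputeCodes(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (in) {                                         // binary already on disk: reuse it
        std::vector<float> codes(static_cast<std::size_t>(in.tellg()) / sizeof(float));
        in.seekg(0);
        in.read(reinterpret_cast<char*>(codes.data()), codes.size() * sizeof(float));
        return codes;
    }
    std::vector<float> codes = computeCodes();        // first run: compute...
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(codes.data()), codes.size() * sizeof(float));
    return codes;                                     // ...and save for later runs
}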

Run it!

Executing ./CloudCache with no arguments uses OpenCV SIFT as the descriptor (with default values) and VLAD (with 16 centroids, 128-dimensional descriptors, i.e. SIFT descriptors, and normalized components).

There are essentially three available detectors/descriptors:

  1. SIFT by OpenCV
  2. SURF by OpenCV
  3. (Parallel) Hessian Affine + SIFT descriptor (aka PHA)

There are 2 encoders:

  1. VLAD encoder with LCS, RN and SSR.
  2. Fisher Vectors by VLFeat.
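Both encoders wrap VLFeat. As a rough, self-contained sketch of a single VLAD encoding step (random data stands in for real SIFT descriptors; illustrative only, not CIRCE's actual code path, and the build line paths are placeholders):

// Example build line: icpc vlad_example.cpp -I$VLFEATROOT -L$VLFEATROOT/bin/glnxa64 -lvl
#include <cstdlib>
#include <vector>
extern "C" {
#include <vl/kmeans.h>
#include <vl/vlad.h>
}

int main() {
    const vl_size dimension  = 128;    // SIFT descriptor dimension
    const vl_size numCenters = 16;     // VLAD vocabulary size
    const vl_size numData    = 1000;   // local descriptors of one image

    // Stand-in for real SIFT descriptors.
    std::vector<float> data(numData * dimension);
    for (float& v : data) v = std::rand() / static_cast<float>(RAND_MAX);

    // Learn the vocabulary with k-means (in CIRCE this comes from the training set).
    VlKMeans* kmeans = vl_kmeans_new(VL_TYPE_FLOAT, VlDistanceL2);
    vl_kmeans_cluster(kmeans, data.data(), dimension, numData, numCenters);

    // Hard-assign every descriptor to its nearest center.
    std::vector<vl_uint32> nearest(numData);
    vl_kmeans_quantize(kmeans, nearest.data(), nullptr, data.data(), numData);
    std::vector<float> assignments(numData * numCenters, 0.f);
    for (vl_size i = 0; i < numData; ++i)
        assignments[i * numCenters + nearest[i]] = 1.f;

    // Aggregate into one VLAD code with SSR and per-component normalization.
    std::vector<float> enc(dimension * numCenters);
    vl_vlad_encode(enc.data(), VL_TYPE_FLOAT,
                   vl_kmeans_get_centers(kmeans), dimension, numCenters,
                   data.data(), numData, assignments.data(),
                   VL_VLAD_FLAG_SQUARE_ROOT | VL_VLAD_FLAG_NORMALIZE_COMPONENTS);

    vl_kmeans_delete(kmeans);
    return 0;
}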

This is the CIRCE syntax:

./CloudCache resizeDim $DIM OMP 0 [$ENCODER] [$DESCRIPTOR] sampled 0 dataset $DATASETNAME

Where:

  • $DIM: each image is resized to this value (0 uses the original image). So far, setting it to 500 has given the best results.
  • $DESCRIPTOR: PHA | SIFTOpenCV 0 3 0.04 10 1.6 | SURFOpenCV 400 4 3 0 1
    • Default: SIFTOpenCV 0 3 0.04 10 1.6.
    • The values for SIFTOpenCV and SURFOpenCV are the default constructor values of the corresponding OpenCV classes; feel free to change them (see the sketch after this list).
  • $ENCODER: VLAD $NUMCENTERS $DESCDIM "VL_VLAD_FLAG_SQUARE_ROOT|VL_VLAD_FLAG_NORMALIZE_COMPONENTS" | FisherVector $NUMCENTERS $DESCDIM "VL_FISHER_FLAG_IMPROVED"
    • $NUMCENTERS: number of encoder centers (usually 64, 128, 256 or 512)
    • $DESCDIM: descriptor dimension (64 for SURFOpenCV, 128 for SIFTOpenCV and PHA).
    • Note on VLAD: to disable RN (Residual Normalization), LCS (Local Coordinate System) or SSR (Signed Square Rooting), delete -DRN -DLCS -DSSR from $CCROOT/make/makefile, respectively.
  • $DATASETNAME: painting|posters|oxfordForCache
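For reference, the numeric arguments after SIFTOpenCV and SURFOpenCV follow the parameters of the corresponding OpenCV factory functions. A sketch of the equivalent C++ calls (assuming OpenCV 3.x with the xfeatures2d contrib module; query.jpg is a placeholder file name):

#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/xfeatures2d.hpp>

int main() {
    // "SIFTOpenCV 0 3 0.04 10 1.6" -> nfeatures, nOctaveLayers,
    //                                 contrastThreshold, edgeThreshold, sigma
    cv::Ptr<cv::Feature2D> sift = cv::xfeatures2d::SIFT::create(0, 3, 0.04, 10, 1.6);

    // "SURFOpenCV 400 4 3 0 1" -> hessianThreshold, nOctaves, nOctaveLayers,
    //                             extended, upright
    cv::Ptr<cv::Feature2D> surf = cv::xfeatures2d::SURF::create(400, 4, 3, false, true);

    cv::Mat img = cv::imread("query.jpg", cv::IMREAD_GRAYSCALE);
    std::vector<cv::KeyPoint> keypoints;
    cv::Mat descriptors;                      // one 128-D (SIFT) row per keypoint
    sift->detectAndCompute(img, cv::noArray(), keypoints, descriptors);
    return 0;
}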

Examples:

./CloudCache resizeDim 500 OMP 0 FisherVector 64 128 "VL_FISHER_FLAG_IMPROVED" PHA sampled 0 dataset painting

  • Images are resized so that the largest side is 500 px, keeping the original aspect ratio.
  • Fisher vector with 64 centers and descriptor in 128 dimensions
  • Parallel Hessian Affine + SIFT descriptor
  • Uffizi painting dataset

./CloudCache resizeDim 0 OMP 0 VLAD 256 64 "VL_VLAD_FLAG_SQUARE_ROOT|VL_VLAD_FLAG_NORMALIZE_COMPONENTS" SURFOpenCV 400 4 3 0 1 sampled 0 dataset posters

  • Use the original image size (time- and memory-expensive)
  • VLAD code with 256 centers and descriptor in 64 dimensions
  • SURF with OpenCV default values
  • Movie poster dataset

Output statistics

At the end of each CIRCE execution, a new line is appended to $OUTPUT/stats.ods (created automatically if it doesn't exist), containing statistics such as the descriptor used, the encoder, the dataset, MAP, MRR and timings.