image retrieval demo
This repo is a minimal demo for image retrieval/loop closure detection. It is based on the DBoW3 project.
Prerequisites
The code has been tested on macOS (Apple silicon) with clang-1316.0.21.2.5. It should also work on Ubuntu. You need to install the following libraries:
OpenCV 3.4.16 with contrib modules
We use OpenCV to extract keypoints and descriptors. The contrib modules are required. Just follow the official guide to install OpenCV with contrib modules.
Create a temporary directory, which we denote as <cmake_build_dir>, to hold the generated Makefiles, project files, object files and output binaries, then enter it.
For example:
cd ~/opencv
mkdir build
cd build
Configure the build by running cmake [<some optional parameters>] <path to the OpenCV source directory>
For example:
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..
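To build with the modules from opencv_contrib, set OPENCV_EXTRA_MODULES_PATH to <path to opencv_contrib/modules/>, for example by adding -D OPENCV_EXTRA_MODULES_PATH=<path to opencv_contrib/modules/> to the cmake command.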
DBoW3
We use DBoW3 to build the bag-of-words vocabulary, encode images and perform image retrieval.
DBoW3 only requires OpenCV. As described in the official guide, we install DBoW3 as follows:
git clone https://github.com/rmsalinas/DBow3.git
cd DBow3
mkdir build
cd build/
cmake ..
make
sudo make install
Build
Clone the repository and use the build script to build the project.
sh build.sh
How to use
There are two main programs in the project: make_voc and query.
make_voc
This program builds the bag-of-words vocabulary from a set of images. The vocabulary is saved to a file. The program takes two arguments: the path to a text file containing the paths of images and the path to the output voc file.
make_voc <images_txt> <vocabulary_output_file>
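The image list can be generated with standard shell tools. The snippet below is a minimal sketch, not part of this repo: make_image_list is a hypothetical helper name, and it assumes the list file holds one image path per line and that the images are .png files (as in KITTI). Check the make_voc source if your list needs a different layout.

```shell
# Hypothetical helper: write every .png found under directory $1
# to the list file $2, one path per line, in sorted order.
# Assumes make_voc expects one image path per line.
make_image_list() {
    find "$1" -type f -name '*.png' | sort > "$2"
}

# Example usage (placeholder paths, adjust to your setup):
#   make_image_list /path/to/images images.txt
#   make_voc images.txt my_voc.yml.gz
```

Sorting keeps the list order stable across runs, which makes it easier to compare vocabularies built from the same image set.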
query
This program performs image retrieval based on a given vocabulary. It takes two arguments: the path to the vocabulary file and the path to the database images directory. The program loads all images in that directory to build a database, and you can then query any image in the database. For each query it shows the top 4 retrieved images and their scores (the top result is the query image itself, with a score of 1). Enter exit to quit the program.
query <vocabulary_file> <database_dir>
data
We test the program on the KITTI dataset. The KITTI Vision Benchmark Suite is a repository of real-world data for autonomous driving created by the Karlsruhe Institute of Technology and the Toyota Technological Institute at Chicago.
We provide a small vocabulary, KITTI_voc.yml.gz, trained on 10 sequences of the KITTI dataset, in data/. You can use it to test the program. There is also a large vocabulary, orbvoc.dbow3, provided by DBoW3.
We also use one_hot_gen to transform the images in sequences 02 and 05 of the KITTI dataset into one-hot encodings using the vocabulary KITTI_voc.yml.gz. The encodings are saved in data/02.txt and data/05.txt. Each line in these text files contains the image path followed by the 10000-dimensional one-hot encoding of the image.
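To sanity-check an encoding file, something like the following can help. It is only a sketch under assumptions: count_active_words is a hypothetical helper name, fields are assumed to be separated by whitespace with the image path first, and paths are assumed to contain no spaces. Verify the actual separator in data/02.txt before relying on it.

```shell
# Hypothetical helper: for each line of the encoding file $1, print
#   <image path> <encoding dimension> <number of non-zero entries>
# Assumes whitespace-separated fields with the path in the first field.
count_active_words() {
    awk '{
        n = 0
        for (i = 2; i <= NF; i++) if ($i + 0 != 0) n++
        print $1, NF - 1, n
    }' "$1"
}

# Example usage:
#   count_active_words data/02.txt | head
```

If the assumptions hold, the second column should read 10000 on every line, and the third column shows how many visual words are active in each image.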