Bag of Visual Words
This repository contains my solution to the final project of the Modern C++ course taught by Cyrill Stachniss and Ignacio Vizzo at the University of Bonn. All credit for the problem formulation and the course material belongs to them.
Thanks to them and to the University for posting this material and all the other courses online for free.
The instructions on which this project is based can be found in this video. The project relies heavily on the course homeworks, whose solutions can be found in this other repo.
Dependencies
To build and run the code in this repo you need to install the following dependencies:
Ubuntu 20.04
To easily install it alongside Windows you can follow this tutorial. The clang compiler should be preinstalled in this distro; however, it is probably a good idea to run this script to make sure you have the toolchain suggested by Ignacio Vizzo for the homeworks.
Visual Studio Code (Recommended)
I used VS Code to work on this repo. To install it and use the suggested extensions and configuration you can follow these instructions by Ignacio Vizzo.
OpenCV
It's a compile-time dependency for several homeworks. The full version of OpenCV is necessary to use the SIFT feature extractor. To install OpenCV the same way I did, you can follow these instructions by Ignacio Vizzo.
fmt
It's a compile-time dependency for the HTML visualization. You can install it using the package manager:
sudo apt-get install libfmt-dev
How to run the code?
To run the code, first create a folder named dataset and place the following in it:
- A folder named raw_imgs containing all the png files. You can download the UniBonn dataset here.
- A folder named sifts_bin containing the precomputed SIFT descriptors for all the images in the raw_imgs folder. You can compute them with the code from the course's homework_5.
- A file named dictionary.bin containing the precomputed BoW dictionary, obtained by applying the k-means algorithm to all the SIFT descriptors. You can compute it with the code from the course's homework_9.
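The layout above can be verified before building. The following is a minimal sketch, not code from this repo: the function name MissingDatasetEntries is hypothetical, and only the entry names (raw_imgs, sifts_bin, dictionary.bin) come from the requirements listed above. It assumes a C++17 compiler for std::filesystem.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Returns the names of the expected dataset entries that are missing under
// `dataset_root`. An empty result means the required layout is in place.
// NOTE: hypothetical helper for illustration; it is not part of the repo.
std::vector<std::string> MissingDatasetEntries(const fs::path& dataset_root) {
  std::vector<std::string> missing;
  // Folder with the raw png images.
  if (!fs::is_directory(dataset_root / "raw_imgs")) {
    missing.emplace_back("raw_imgs");
  }
  // Folder with the precomputed SIFT descriptors.
  if (!fs::is_directory(dataset_root / "sifts_bin")) {
    missing.emplace_back("sifts_bin");
  }
  // File with the precomputed BoW dictionary.
  if (!fs::is_regular_file(dataset_root / "dictionary.bin")) {
    missing.emplace_back("dictionary.bin");
  }
  return missing;
}
```

Calling MissingDatasetEntries("dataset") from the directory that contains the dataset folder and printing the returned names gives a quick hint about what still needs to be downloaded or computed.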
Once these requirements are met, create a bin folder inside the repo, go into it, and build the code. You can do all of this from a terminal with:
mkdir -p bin && cd bin && cmake .. && make
Then, while in the bin folder, type:
./place_recognition
The program will generate an HTML file for each image in the raw images dataset, containing the 8 most similar images among all the others.
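To illustrate how the "most similar images" for one query could be selected, here is a minimal sketch. It assumes each image is already represented by a BoW histogram and that similarity is measured with cosine distance; this README does not state which metric the program actually uses, so treat the metric and the function names (CosineDistance, MostSimilar) as assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Cosine distance between two histograms of equal length:
// 0 when they point in the same direction, 1 when they are orthogonal.
// ASSUMPTION: the metric is illustrative, not necessarily the repo's choice.
double CosineDistance(const std::vector<double>& a,
                      const std::vector<double>& b) {
  const double dot = std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
  const double norm_a =
      std::sqrt(std::inner_product(a.begin(), a.end(), a.begin(), 0.0));
  const double norm_b =
      std::sqrt(std::inner_product(b.begin(), b.end(), b.begin(), 0.0));
  if (norm_a == 0.0 || norm_b == 0.0) {
    return 1.0;  // Treat an all-zero histogram as maximally distant.
  }
  return 1.0 - dot / (norm_a * norm_b);
}

// Returns the indices of the (at most) k histograms closest to
// histograms[query], nearest first, excluding the query itself.
std::vector<std::size_t> MostSimilar(
    const std::vector<std::vector<double>>& histograms, std::size_t query,
    std::size_t k) {
  if (query >= histograms.size()) {
    return {};
  }
  std::vector<std::size_t> candidates;
  std::vector<double> distance(histograms.size(), 0.0);
  for (std::size_t i = 0; i < histograms.size(); ++i) {
    if (i == query) {
      continue;
    }
    candidates.push_back(i);
    distance[i] = CosineDistance(histograms[query], histograms[i]);
  }
  k = std::min(k, candidates.size());
  // Only the first k positions need to be sorted.
  std::partial_sort(candidates.begin(),
                    candidates.begin() + static_cast<std::ptrdiff_t>(k),
                    candidates.end(),
                    [&distance](std::size_t lhs, std::size_t rhs) {
                      return distance[lhs] < distance[rhs];
                    });
  candidates.resize(k);
  return candidates;
}
```

With k set to 8, MostSimilar(histograms, i, 8) would return the indices of the images to list in the page generated for image i. std::partial_sort is used because only the first k entries need to be ordered, which avoids sorting the whole dataset for every query.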
Disclaimer: The code can work starting from only the raw images dataset; however, that would require some minor changes, and computing all the SIFT descriptors and the dictionary would take a long time. If you want to do this, or if you would like to use my dictionary.bin file and my sifts_bin folder, open an issue on the repo.
Sample output
The image below shows a row of an HTML file produced by the program.