Bag of Visual Words

This repository contains my solution to the final project of the Modern C++ course taught by Cyrill Stachniss and Ignacio Vizzo at the University of Bonn. All credit for the problem formulation and the course material belongs to them.

Thanks to them and to the University for posting this material and all the other courses online for free.

The instructions on which this project is based can be found in this video. Its development relies heavily on the course homeworks, whose solutions can be found in this other repo.

Dependencies

To build and work with the files provided in this repo, you need to install the following dependencies:

Ubuntu 20.04

To easily install it alongside Windows, you can follow this tutorial. The clang compiler should be preinstalled in this distro; however, it is probably a good idea to run this script to make sure you have the toolchain suggested by Ignacio Vizzo for the homeworks.

Visual Studio Code (Recommended)

I used VS Code to work on this repo. To install it with the suggested extensions and configuration, you can follow these instructions by Ignacio Vizzo.

OpenCV

It is a compile-time dependency for several homeworks. The full version of OpenCV is necessary to use the SIFT feature extractor. To install OpenCV the same way I did, you can follow these instructions by Ignacio Vizzo.

fmt

It is a compile-time dependency for the HTML visualization. You can install it using the package manager:

sudo apt-get install libfmt-dev

How to run the code?

To run the code, you first need to create a folder named dataset containing:

A folder named raw_imgs containing all the png files. You can download the UniBonn dataset here.
A folder named sifts_bin containing the precomputed SIFT descriptors for all the images in the raw_imgs folder. You can generate them with the code from the course's homework_5.
A file named dictionary.bin containing the precomputed BoW dictionary, obtained by applying the k-means algorithm to all the SIFT descriptors. You can generate it with the code from the course's homework_9.
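Before building, it can help to confirm that the dataset folder has the layout listed above. The following is a minimal C++17 sketch of such a check. The function name MissingDatasetEntries is hypothetical and is not part of this repository; only the three entry names (raw_imgs, sifts_bin, dictionary.bin) come from the list above.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not part of this repo): returns the names of the
// expected dataset entries that are missing under `dataset_dir`.
std::vector<std::string> MissingDatasetEntries(const fs::path& dataset_dir) {
  std::vector<std::string> missing;
  // Folder with the raw PNG images.
  if (!fs::is_directory(dataset_dir / "raw_imgs")) {
    missing.emplace_back("raw_imgs");
  }
  // Folder with the precomputed SIFT descriptors.
  if (!fs::is_directory(dataset_dir / "sifts_bin")) {
    missing.emplace_back("sifts_bin");
  }
  // File with the precomputed k-means dictionary.
  if (!fs::is_regular_file(dataset_dir / "dictionary.bin")) {
    missing.emplace_back("dictionary.bin");
  }
  return missing;
}
```

Calling MissingDatasetEntries("dataset") from the repo root returns an empty vector when all three entries are in place. It only checks that the entries exist, not that their contents are valid.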

If you meet these requirements, you can then create a bin folder inside the repo, go into that folder, and build the code. To do all this, type in a terminal:

mkdir -p bin && cd bin && cmake .. && make

Then, while in the bin folder type:

./place_recognition

The program will generate an HTML file for each image in the raw images dataset, showing the 8 most similar images among all the others.
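To illustrate what "the 8 most similar images" can mean in a bag-of-visual-words setting, here is a minimal sketch of a top-k ranking over per-image word histograms. It assumes cosine similarity as the comparison metric and histograms of equal length; this README does not state which metric or data layout the actual code uses, and the names Histogram, CosineSimilarity and MostSimilar are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Assumption: each image is summarized by a histogram of visual-word counts,
// and all histograms have the same length (the dictionary size).
using Histogram = std::vector<double>;

// Cosine similarity between two histograms; returns 0 if either is all zeros.
double CosineSimilarity(const Histogram& a, const Histogram& b) {
  const double dot = std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
  const double norm_a =
      std::sqrt(std::inner_product(a.begin(), a.end(), a.begin(), 0.0));
  const double norm_b =
      std::sqrt(std::inner_product(b.begin(), b.end(), b.begin(), 0.0));
  if (norm_a == 0.0 || norm_b == 0.0) return 0.0;
  return dot / (norm_a * norm_b);
}

// Returns the indices of the `k` histograms most similar to
// `histograms[query]`, best match first, excluding the query itself.
std::vector<std::size_t> MostSimilar(const std::vector<Histogram>& histograms,
                                     std::size_t query, std::size_t k = 8) {
  std::vector<std::size_t> candidates;
  std::vector<double> score(histograms.size(), 0.0);
  for (std::size_t i = 0; i < histograms.size(); ++i) {
    if (i == query) continue;
    candidates.push_back(i);
    score[i] = CosineSimilarity(histograms[query], histograms[i]);
  }
  k = std::min(k, candidates.size());
  // Only the first k positions need to be sorted.
  std::partial_sort(candidates.begin(),
                    candidates.begin() + static_cast<std::ptrdiff_t>(k),
                    candidates.end(), [&score](std::size_t l, std::size_t r) {
                      return score[l] > score[r];
                    });
  candidates.resize(k);
  return candidates;
}
```

With one histogram per image, MostSimilar(histograms, i) gives the indices of the images that would appear in the HTML file for image i under these assumptions.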

Disclaimer: The code can work using only the raw images dataset; however, it would need some minor changes, and computing all the SIFT descriptors and the dictionary would take a long time. If you want to do this, or you would like to use my dictionary.bin file and my sifts_bin folder, open an issue on the repo.

Sample output

The image below shows a row of an HTML file produced by the program.

pepisg/Bag-of-Visual-Words