This repo contains some modules and scripts to classify sets of images that belong together in a panorama.
-
Install Python 3.6+ (https://www.python.org) and pipenv
$ pip install pipenv
-
Clone this repo
$ git clone https://github.com/alkasm/panorama-classifier.git
-
Change directories into this repo
$ cd panorama-classifier
-
Install the necessary packages from the Pipfile (opencv-python and networkx and their dependencies)
panorama-classifier $ pipenv install
-
To run any script inside this environment, prefix the command with pipenv run
panorama-classifier $ pipenv run python classify_hist.py --help
To classify panoramas based on their color histograms, you can use the classify_hist.py script and provide a folder containing images.
panorama-classifier $ pipenv run python classify_hist.py --help
usage: classify_hist.py [-h] [--thresh THRESH] data

positional arguments:
  data             directory with panoramic images

optional arguments:
  -h, --help       show this help message and exit
  --thresh THRESH  threshold for matching
The data directory is expected to contain only two types of files: image files and an optional .json file holding ground truth data to compare the classification against. Every file other than the .json file is presumed to be readable by OpenCV. Pass the optional argument --thresh THRESH to override the default threshold of 1.25.
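The directory layout described above can be sketched as follows. This is a minimal illustration, not the script's actual loading code: split_data_dir is a hypothetical helper, and picking the first .json file as the ground truth is an assumption.

```python
from pathlib import Path


def split_data_dir(data_dir):
    """Return (image_paths, ground_truth_path) for a data directory.

    Hypothetical helper: every non-.json file is treated as an image, and the
    first .json file found (if any) is treated as the ground truth file.
    """
    files = sorted(p for p in Path(data_dir).iterdir() if p.is_file())
    json_files = [p for p in files if p.suffix.lower() == ".json"]
    images = [p for p in files if p.suffix.lower() != ".json"]
    ground_truth = json_files[0] if json_files else None
    return images, ground_truth
```

Anything returned in image_paths would then be handed to OpenCV for reading, so stray non-image files in the folder would fail at that step.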
From Wikipedia:
Content-based image retrieval (CBIR), also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR) is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. ...
"Content-based" means that the search analyzes the contents of the image rather than the metadata such as keywords, tags, or descriptions associated with the image. The term "content" in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself.
The cbir module contains two main pieces: the image content descriptors, and the database used for image retrieval.
For the descriptors, descriptors.py defines some interfaces (Features and Descriptor) which can be subclassed to be used for the CBIR system, depending on how you want your images described and what distance metric should be used between features. Currently, the module has a HistogramDescriptor implemented which returns HistogramFeatures that can be compared with their distance() method, which uses the chi-squared distance metric to measure the distance between two histograms.
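As a rough illustration of the idea, the sketch below pairs a features interface with a chi-squared distance. It is an illustrative stand-in, not the code in descriptors.py: the class bodies, the normalization step, the eps guard, and the symmetric form of the chi-squared distance (several variants exist) are all assumptions.

```python
from abc import ABC, abstractmethod


class Features(ABC):
    """Illustrative stand-in for a features interface (real one may differ)."""

    @abstractmethod
    def distance(self, other):
        """Return a non-negative distance to another Features instance."""


class HistogramFeatures(Features):
    """Illustrative histogram features compared with a chi-squared distance."""

    def __init__(self, hist):
        total = float(sum(hist))
        # Normalize so images of different sizes are comparable (assumption).
        self.hist = [h / total for h in hist] if total else list(hist)

    def distance(self, other, eps=1e-10):
        # Symmetric chi-squared distance: 0.5 * sum((a - b)^2 / (a + b)).
        # eps avoids division by zero when a bin is empty in both histograms.
        return 0.5 * sum((a - b) ** 2 / (a + b + eps)
                         for a, b in zip(self.hist, other.hist))
```

Identical histograms give a distance of zero, and with this normalization histograms with no overlapping bins give a distance close to one.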
For a reverse image search, the database needs to be initialized with the data and then be queryable. The FeatureDatabase inside featuredb.py is a simple class that holds the indexed database in memory and exposes a query method. Any descriptor following the prototypes can be used here. The databases will be expanded to allow actual database clients in the future (probably Redis for an in-memory database, SQLite for an on-disk database).
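A minimal sketch of such an in-memory index follows. Every name here (InMemoryFeatureDB, add, the k parameter, ScalarFeatures) is hypothetical and chosen for illustration; the actual FeatureDatabase in featuredb.py may be structured differently.

```python
class ScalarFeatures:
    """Toy features for the demo: a single number with absolute-difference distance."""

    def __init__(self, value):
        self.value = value

    def distance(self, other):
        return abs(self.value - other.value)


class InMemoryFeatureDB:
    """Hypothetical sketch of an in-memory feature index with a query method."""

    def __init__(self, descriptor):
        self.descriptor = descriptor  # callable: image -> object with .distance()
        self.index = {}               # image id -> features

    def add(self, image_id, image):
        """Describe an image once and store its features under image_id."""
        self.index[image_id] = self.descriptor(image)

    def query(self, image, k=5):
        """Return the k closest (image_id, distance) pairs, nearest first."""
        features = self.descriptor(image)
        scored = [(image_id, features.distance(stored))
                  for image_id, stored in self.index.items()]
        return sorted(scored, key=lambda pair: pair[1])[:k]


# Usage: numbers stand in for images, ScalarFeatures for a real descriptor.
db = InMemoryFeatureDB(ScalarFeatures)
for image_id, value in [("a", 1.0), ("b", 5.0), ("c", 2.0)]:
    db.add(image_id, value)
nearest = db.query(1.2, k=2)
```

Because the index only relies on a distance() method, swapping in a different descriptor would not require changes to the database class.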
The classifier recognizes panoramas by computing the distance between every pair of image descriptors in the database. This defines a graph where each edge is weighted by the distance between two images. Thresholding this graph on the edge weights keeps only the strongest matches, and the connected components of the thresholded graph correspond to the individual panoramas in the dataset. This module uses networkx to define the graph and run connected components on the result.
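The grouping step can be sketched without any dependencies. The version below is a plain-Python stand-in for the networkx graph, using a depth-first search where networkx would use connected_components; it is not the module's code. group_by_threshold and Point are hypothetical names, and keeping edges whose distance is strictly below the threshold is an assumption about how "strongest matches" is defined.

```python
from itertools import combinations


class Point:
    """Toy features for the demo: a number with absolute-difference distance."""

    def __init__(self, value):
        self.value = value

    def distance(self, other):
        return abs(self.value - other.value)


def group_by_threshold(features, thresh=1.25):
    """Group image ids into connected components of the thresholded graph.

    features: dict mapping image id -> object with a .distance(other) method.
    """
    # Build the thresholded graph as an adjacency list (assumption: keep
    # an edge only when the pairwise distance is below thresh).
    neighbors = {image_id: set() for image_id in features}
    for a, b in combinations(features, 2):
        if features[a].distance(features[b]) < thresh:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Depth-first search collects one connected component per unseen node.
    seen, groups = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        groups.append(component)
    return groups


# Usage: two clusters of toy "images" separated by a large gap.
toy = {"a": Point(0), "b": Point(1), "c": Point(2), "d": Point(10), "e": Point(10.5)}
panoramas = group_by_threshold(toy)
```

Note that "a" and "c" end up in the same group even though their direct distance exceeds the threshold, because both match "b". This chaining is what lets a panorama be recovered from images that only overlap with their neighbors.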
This classifier is loosely based on Brown and Lowe's 2003 ICCV paper Recognising Panoramas, without the probabilistic machinery used in the paper.
-
R. Szeliski, Image Alignment and Stitching: A Tutorial, Microsoft Research, 2004.
-
M. Brown, D. Lowe, Recognising Panoramas, ICCV, 2003.