/CasHash_CUDA

πŸŒ‡ GPU acclearated SIFT feature matching between images

Primary LanguageC++

CasHash-CUDA

Description

This project provides a library for GPU acclearated SIFT feature matching between images.

How fast is it?

According to our benchmark on Tesla K20 GPU, this algorithm can reach 30 fps pairing 30 images each with ~2000 sift vectors at a time.

What for?

This program can be used as a frontend for online image matching as well as large scale 3D reconstruction.

Related Publication

Cheng Jian, Cong Leng, Jiaxiang Wu, Hainan Cui, and Hanqing Lu. β€œFast and accurate image matching with cascade hashing for 3d reconstruction.” In IEEE Conference on Computer Vision and Pattern Recognition (CVPR2014), pp. 1-8. 2014.

Installation

git clone git@github.com:cvcore/cashash_cuda.git cashash_cuda
git submodule init
cd cashash_cuda
mkdir build && cd build
cmake ..
make

Usage

Input
A list of path storing SIFT keyfeatures extracted from the images.
Output
Match pairs.

Sole command:

./KeyMatchCUDA <list.txt> <outfile>

On Lichtweise Cluster:

Extract dataset file into cashash_cuda/dataset, then in build folder, run:

sbatch job.sh

You can download the dataset here:

https://www.dropbox.com/s/ur6l6oigyxfzgrp/cashash_cuda_dataset.zip?dl=0

Todo

  • βœ… SIFT Vector Preprocessing & CPU Storage
    • βœ… Load vectors in all images.
      • βœ… Stream loading with cuda stream and asynchronious functions.
      • Device supports concurrent kernel execution & has 2 async engines
    • βœ… Update all SIFT Vectors to become zero mean
      • Stream preprocessing
      • 1000 images * 2000 sift vectors * 128 dim * 4 byte = 976MiB (We have two GPUs of 5GiB global memory in cluster)
  • βœ… Hash Calculation
    • βœ… Hash Remapping
      • For remapping into 128d Hamming space, we use 1x128 grids.
    • βœ… Bucket Generating
      • For bucketing, we use 6x8 grids.
    • βœ… Bucket Storage
      • Bucket Information: 6 bucket group * 2000 vectors * 1000 images * 2 byte = ~24MiB
      • Remapped vector: 2000 vectors * 1000 images * 16 byte = ~31MiB
  • βœ… Matching
    • βœ… Use __device__ int __popcll(unsigned long long int x) for sorting mapped hash values
    • βœ… Query all vectors according to bucket information stored in previous step
    • ❌ Check multiple image pairs simultaneously