SIFT Flow descriptor implemented on PyTorch

Primary language: Jupyter Notebook. License: MIT.

sift-flow-gpu

Implementation of the SIFT Flow descriptor [1] on GPU using PyTorch.

This implementation is a port of the original implementation available at https://people.csail.mit.edu/celiu/SIFTflow/.

The code processes a batch of images simultaneously for better performance. In GPU mode, the most expensive operation is allocating space for the descriptors on the GPU. However, this allocation is only performed when the shape of the input batch changes. Subsequent calls with batches of the same shape reuse the memory and are therefore much faster.
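The reuse described above follows a common caching pattern: keep the output buffer and reallocate only when the batch shape changes. The sketch below illustrates the idea in plain Python. `ShapeCachedBuffer` and its `alloc` parameter are hypothetical names chosen for illustration, not part of this repository, and the actual implementation may differ.

```python
from typing import Callable, Optional, Tuple

Shape = Tuple[int, ...]


class ShapeCachedBuffer:
    """Hypothetical helper: reallocates only when the requested shape changes."""

    def __init__(self, alloc: Callable[[Shape], object]) -> None:
        # `alloc` stands in for an expensive allocation, e.g. something like
        # torch.empty(shape, device="cuda") in a PyTorch setting (assumption).
        self._alloc = alloc
        self._shape: Optional[Shape] = None
        self._buffer: object = None
        self.num_allocations = 0

    def get(self, shape: Shape) -> object:
        if shape != self._shape:
            # First call or shape change: pay the allocation cost once.
            self._buffer = self._alloc(shape)
            self._shape = shape
            self.num_allocations += 1
        return self._buffer


# Stand-in allocator so the sketch runs without a GPU.
cache = ShapeCachedBuffer(alloc=lambda shape: bytearray(shape[0] * shape[1]))
first = cache.get((2, 8))   # allocates
second = cache.get((2, 8))  # same shape: reuses the buffer
third = cache.get((4, 8))   # new shape: allocates again
```

Only the first and third calls allocate; the second returns the buffer created by the first.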

Code for DAISY descriptors on GPU is also available at https://github.com/hmorimitsu/daisy-gpu.

Requirements

Usage

A simple example is shown below. A more complete, practical example is available in the Jupyter demo notebook.

from sift_flow_torch import SiftFlowTorch

sift_flow = SiftFlowTorch()
imgs = [
    read_some_image,
    read_another_image
]
descs = sift_flow.extract_descriptor(imgs) # This first call can be
                                           # slower, due to memory allocation
imgs2 = [
    read_yet_another_image,
    read_even_one_more_image
]
descs2 = sift_flow.extract_descriptor(imgs2) # Subsequent calls are faster,
                                             # if images retain same shape

# descs[0] is the descriptor of imgs[0] and so on.
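
Because batches of the same shape reuse the allocated memory, it can help to group images by shape before extracting descriptors. A minimal sketch, assuming each image exposes a `.shape` attribute (as NumPy arrays and PyTorch tensors do). `group_by_shape` is a hypothetical helper, not part of this repository.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Tuple


def group_by_shape(imgs: Sequence[Any]) -> Dict[Tuple[int, ...], List[Any]]:
    """Bucket images by shape, preserving input order within each bucket."""
    groups: Dict[Tuple[int, ...], List[Any]] = defaultdict(list)
    for img in imgs:
        # tuple() makes the key hashable regardless of the shape's own type.
        groups[tuple(img.shape)].append(img)
    return dict(groups)
```

Each bucket can then be passed to `sift_flow.extract_descriptor` in turn. The batch size is presumably part of the batch shape, so splitting each bucket into equally sized batches may be needed for the memory to be reused on every call.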

Benchmark

  • Machine configuration:
    • Intel i7 8750H
    • NVIDIA GeForce GTX1070
    • Images 1024 x 436
    • Descriptor size 128
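| Batch size | FP16 | Memory usage (GB)¹ | Time GPU (ms)² | Time GPU (ms)³ | Time CPU (ms) |
|------------|------|--------------------|----------------|----------------|---------------|
| 1          |      | 0.9                | 19.0           | 128.0          | 660.6         |
| 2          |      | 1.3                | 35.3           | 257.1          | 1275.1        |
| 4          |      | 2.1                | 70.7           | 516.2          | 2559.3        |
| 8          |      | 3.7                | 142.5          | 969.4          | 5773.9        |
| 1          | ✔️   | 0.7                | 14.7           |                |               |
| 2          | ✔️   | 0.9                | 27.2           |                |               |
| 4          | ✔️   | 1.3                | 54.8           |                |               |
| 8          | ✔️   | 2.1                | 110.9          |                |               |

¹ Maximum value reported by nvidia-smi during the respective tests.

² NOT including the time to transfer the result from GPU to CPU.

³ Including the time to transfer the result from GPU to CPU.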

These times are the median of 5 runs, measured after a warm-up run that allocates the descriptor space in memory (see the introduction).
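
The measurement protocol above (one untimed warm-up, then the median of 5 timed runs) can be sketched with the standard library alone. This is an illustration under those stated assumptions, not the script used to produce the table. `median_time_ms` is a hypothetical helper name.

```python
import statistics
import time
from typing import Callable


def median_time_ms(fn: Callable[[], object], runs: int = 5, warmup: int = 1) -> float:
    """Median wall-clock time of `fn` in milliseconds, after untimed warm-up calls."""
    for _ in range(warmup):
        fn()  # untimed; e.g. triggers the one-time descriptor allocation
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

When timing GPU code, note that CUDA kernels run asynchronously. The timed function should therefore end with a synchronization point such as `torch.cuda.synchronize()` or a device-to-host copy. Otherwise the clock may only capture the kernel launch.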

References

[1] C. Liu, J. Yuen, and A. Torralba. "SIFT Flow: Dense correspondence across scenes and its applications." IEEE Transactions on Pattern Analysis and Machine Intelligence 33.5 (2010): 978-994.