awesome-local-global-descriptor

This is my personal note on local and global descriptors, written to help anyone get into these fields more easily. If you find anything you want to add, feel free to open an issue or email me.

This repo is also a by-product of the survey I did for our paper UR2KiD. If you find this repo useful, please consider citing our paper.

@article{yang2020ur2kid,
  title={UR2KiD: Unifying Retrieval, Keypoint Detection, and Keypoint Description without Local Correspondence Supervision},
  author={Yang*, Tsun-Yi and Nguyen*, Duy-Kien and Heijnen, Huub and Balntas, Vassileios},
  journal={arXiv preprint arXiv:2001.07252},
  year={2020}
}

This repo will be constantly updated.

Author: Tsun-Yi Yang (shamangary@hotmail.com)

Local matching pipeline

In this section, I review sparse keypoint matching and its pipeline.

1. Keypoint detection

This subsection reviews keypoint detection and the estimation of its orientation, scale, or affine transformation.

Year Paper link Code
[ICCV19] Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters PDF Github
[ECCV18] Repeatability Is Not Enough: Learning Discriminative Affine Regions via Discriminability arXiv Github
[CVPR17] Learning Discriminative and Transformation Covariant Local Feature Detectors PDF Github
[CVPR17] Quad-networks: unsupervised learning to rank for interest point detection PDF -
[CVPR16] Learning to Assign Orientations to Feature Points - Github
[CVPR15] TILDE: a Temporally Invariant Learned DEtector arXiv Github
  • 3D
Year Paper link Code
[ICCV19] USIP: Unsupervised Stable Interest Point Detection from 3D Point Clouds arXiv Github
[arXiv19] Self-Supervised 3D Keypoint Learning for Ego-motion Estimation arXiv Github

2. Keypoint description (local descriptor)

Over the last few decades, research has focused on patch descriptors.

  • Hand-crafted
Year Paper link Code
[CVPR16] Accumulated Stability Voting: A Robust Descriptor from Descriptors of Multiple Scales PDF Github
[CVPR15] Domain-Size Pooling in Local Descriptors: DSP-SIFT PDF -
[CVPR15] BOLD - Binary Online Learned Descriptor For Efficient Image Matching PDF Github
[CVPR13] Boosting binary keypoint descriptors - -
[CVPR12] Freak: Fast retina keypoint - -
[CVPR12] Three things everyone should know to improve object retrieval PDF -
[IPOL11] ASIFT: An Algorithm for Fully Affine Invariant Comparison - -
[ICCV11] BRISK: Binary robust invariant scalable keypoints - -
[ICCV11] Orb: An efficient alternative to sift or surf - -
[ICCV11] Local intensity order pattern for feature description - -
[CVIU06] Speeded-up robust features (SURF) - -
[ECCV06] SURF: Speeded up robust features - -
[IJCV04] Distinctive image features from scale-invariant keypoints - Github
  • Deep learning
Year Paper link Code
[TIP19] Learning Local Descriptors by Optimizing the Keypoint-Correspondence Criterion: Applications to Face Matching, Learning from Unlabeled Videos and 3D-Shape Retrieval arXiv Github
[ICCV19] Beyond Cartesian Representations for Local Descriptors PDF -
[CVPR19] SOSNet: Second Order Similarity Regularization for Local Descriptor Learning arXiv,Page Github
[ECCV18] GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints - Github
[CVPR18] Local Descriptors Optimized for Average Precision Page -
[NIPS17] Working hard to know your neighbor's margins: Local descriptor learning loss arXiv Github
[ICCV17] DeepCD: Learning Deep Complementary Descriptors for Patch Representations PDF Github
[CVPR17] L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space PDF Github
[arXiv16] PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors arXiv Github
[BMVC16] Learning local feature descriptors with triplets and shallow convolutional neural networks PDF Github
[ICCV15] Discriminative Learning of Deep Convolutional Feature Point Descriptors Page Github
[CVPR15] MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching PDF -
[CVPR15] Learning to compare image patches via convolutional neural networks PDF Github
  • 3D
Year Paper link Code
[arXiv19] DeepPoint3D: Learning Discriminative Local Descriptors using Deep Metric Learning on 3D Point Clouds arXiv -

3. End-to-end matching pipeline

Recently, more and more papers try to embed the whole matching pipeline (keypoint detection, keypoint description) into one framework.

Year Paper link Code
[arXiv20] DISK: Learning local features with policy gradient arXiv -
[arXiv20] D2D: Keypoint Extraction with Describe to Detect Approach arXiv -
[arXiv20] HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning arXiv -
[arXiv20] Learning Feature Descriptors using Camera Pose Supervision arXiv -
[arXiv20] Efficient Neighbourhood Consensus Networks via Submanifold Sparse Convolutions arXiv github
[arXiv20] S2DNet: Learning Accurate Correspondences for Sparse-to-Dense Feature Matching arXiv -
[CVPR20] ASLFeat: Learning Local Features of Accurate Shape and Localization arXiv github,tfmatch
[CVPR20] Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task arXiv -
[NIPS19] R2D2: Repeatable and Reliable Detector and Descriptor arXiv,Page Github
[ICCV19] ELF: Embedded Localisation of Features in Pre-Trained CNN PDF Github
[CVPR19] RF-Net: An End-to-End Image Matching Network based on Receptive Field arXiv Github
[CVPR19] D2-Net: A Trainable CNN for Joint Description and Detection of Local Features arXiv,Page Github
[BMVC19] Matching Features without Descriptors: Implicitly Matched Interest Points PDF github
[CVPRW18] SuperPoint: Self-Supervised Interest Point Detection and Description arXiv Github,3rd_party
[NIPS18] LF-Net: Learning Local Features from Images PDF Github
[ECCV16] LIFT: Learned Invariant Feature Points - Github
  • 3D
Year Paper link Code
[arXiv20] StickyPillars: Robust feature matching on point clouds using Graph Neural Networks arXiv -

3.5. Dense descriptor

Unlike local keypoint descriptors, which depend on detected keypoints, some works compute a dense descriptor representation for the whole image.

Year Paper link Code
[ICRA20] GN-Net: The Gauss-Newton Loss for Multi-Weather Relocalization arXiv, MyNote Web
[ICCV17] CLKN: Cascaded Lucas-Kanade Networks for Image Alignment PDF -

4. Geometric verification or learning based matcher

After matching, standard RANSAC and its variants are usually adopted for outlier removal.
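As a rough illustration of the outlier-removal idea, here is a minimal RANSAC sketch in pure Python. It is not taken from any of the papers listed here. To stay short it assumes the simplest possible motion model, a 2D translation, so that a single match is a minimal sample; real pipelines usually fit a homography or a fundamental/essential matrix instead. The function name `ransac_translation`, its parameters, and the toy data are all hypothetical.

```python
import random

def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate a 2D translation from putative matches and flag inliers.

    matches: list of ((x1, y1), (x2, y2)) putative correspondences.
    Returns (translation, inlier_mask).
    """
    rng = random.Random(seed)
    best_mask = [False] * len(matches)
    best_model = (0.0, 0.0)
    for _ in range(iterations):
        # Minimal sample: one match fully determines a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        tx, ty = x2 - x1, y2 - y1
        # Score the hypothesis by counting matches that agree with it.
        mask = [
            ((a[0] + tx - b[0]) ** 2 + (a[1] + ty - b[1]) ** 2) ** 0.5 <= threshold
            for a, b in matches
        ]
        if sum(mask) > sum(best_mask):
            best_mask, best_model = mask, (tx, ty)
    # Refit on all inliers (for a translation this is the mean offset).
    inliers = [m for m, keep in zip(matches, best_mask) if keep]
    if inliers:
        tx = sum(b[0] - a[0] for a, b in inliers) / len(inliers)
        ty = sum(b[1] - a[1] for a, b in inliers) / len(inliers)
        best_model = (tx, ty)
    return best_model, best_mask

# Toy data (assumed): six matches shifted by (5, -3), plus two gross outliers.
points = [(0, 0), (10, 2), (4, 7), (8, 8), (1, 9), (6, 3)]
matches = [((x, y), (x + 5, y - 3)) for x, y in points]
matches += [((2, 2), (40, 40)), ((7, 1), (-20, 15))]
model, mask = ransac_translation(matches)
```

The variants listed below (PROSAC, MAGSAC, and so on) mainly change how samples are drawn and how hypotheses are scored; the hypothesize-and-verify loop stays the same.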

  • Algorithm based
Year Paper link Code
[arXiv20] AdaLAM: Revisiting Handcrafted Outlier Detection arXiv github
[arXiv20] Multi-View Optimization of Local Feature Geometry arXiv -
[CVPR19] MAGSAC: Marginalizing Sample Consensus PDF Github
[CVPR16] Progressive Feature Matching with Alternate Descriptor Selection and Correspondence Enrichment PDF -
[CVPR13] Robust Feature Matching with Alternate Hough and Inverted Hough Transforms PDF -
[ECCV12] Improving Image-Based Localization by Active Correspondence Search PDF -
[CVPR05] Matching with PROSAC – Progressive Sample Consensus PDF -
[CVPR05] Two-View Geometry Estimation Unaffected by a Dominant Plane PDF Github
  • Deep learning based
Year Paper link Code
[CVPR20] SuperGlue: Learning Feature Matching with Graph Neural Networks arXiv Github
[CVPR20] High-dimensional Convolutional Networks for Geometric Pattern Recognition arXiv, youtube -
[CVPR20] ACNe: Attentive Context Normalization for Robust Permutation-Equivariant Learning arXiv github
[arXiv20] RANSAC-Flow: generic two-stage image alignment arXiv, youtube page,Github
[ICCV19] NG-RANSAC for Epipolar Geometry from Sparse Correspondences arXiv Github
[ICCV19] Learning Two-View Correspondences and Geometry Using Order-Aware Network arXiv Github
[CVPR18] Learning to Find Good Correspondences - Github
  • Image registration
Year Paper link Code
[arXiv20] Deep Global Registration arXiv, youtube -
[Access18] Multi-Temporal Remote Sensing Image Registration Using Deep Convolutional Features PDF Github

Global retrieval

Since global retrieval usually searches over a large number of candidates, there are several ways to generate a single description for each image.

1. Feature aggregation

  • Hand-crafted

When only hand-crafted local descriptors are available, people usually aggregate a set of local descriptors into a single description.
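To make the aggregation step concrete, here is a minimal VLAD-style sketch in pure Python. It assumes the visual words (centroids) are already given, for example from k-means, and it skips refinements such as intra-normalization or power-law normalization. The function name `vlad` and the toy data are hypothetical, not the reference implementation of any paper below.

```python
import math

def vlad(descriptors, centroids):
    """Aggregate local descriptors into one VLAD-style global vector.

    descriptors: list of D-dimensional local descriptors (lists of floats).
    centroids: list of K D-dimensional visual words (assumed precomputed).
    Returns a list of length K * D, L2-normalized.
    """
    dim = len(centroids[0])
    residuals = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        # Hard-assign the descriptor to its nearest visual word.
        k = min(range(len(centroids)), key=lambda i: math.dist(d, centroids[i]))
        # Accumulate the residual between the descriptor and that word.
        for j in range(dim):
            residuals[k][j] += d[j] - centroids[k][j]
    # Concatenate the per-word residual sums and L2-normalize the result.
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy data (assumed): two visual words and three 2D local descriptors.
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
global_descriptor = vlad(descriptors, centroids)
```

Whatever the number of local descriptors, the output always has length K * D, which is what makes it usable as a single description per image.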

Year Paper link Code
[ICCV13] To aggregate or not to aggregate: Selective match kernels for image search ICCV Official : matlab, from DELF (tensorflow)
[IJCV15] Image search with selective match kernels: aggregation across single and multiple images IJCV Official : matlab, from DELF (tensorflow)
[CVPR13] All about VLAD PDF -
[ECCV10] Improving the fisher kernel for large-scale image classification PDF -
[CVPR07] Object retrieval with large vocabularies and fast spatial matching PDF -
[CVPR06] Fisher kernels on visual vocabularies for image categorization PDF -
  • Deep learning

A similar idea, but using deep learning to adapt the classical algorithms.

Year Paper link Code
[ECCV16] CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples. PDF -
[CVPR16] NetVLAD: CNN architecture for weakly supervised place recognition Page Github

2. Real-valued descriptor

One single representation from the image.

Year Paper link Code
[arXiv20] SOLAR: Second-Order Loss and Attention for Image Retrieval arXiv -
[arXiv20] Unifying Deep Local and Global Features for Efficient Image Search arXiv -
[arXiv19] ACTNET: end-to-end learning of feature activations and multi-stream aggregation for effective instance image retrieval arXiv -
[TIP19] REMAP: Multi-layer entropy-guided pooling of dense CNN features for image retrieval arXiv -
[ICCV19] Learning with Average Precision: Training Image Retrieval with a Listwise Loss arXiv Github
[CVPR19] Detect-to-Retrieve: Efficient Regional Aggregation for Image Search PDF Github
[TPAMI18] Fine-tuning CNN Image Retrieval with No Human Annotation arXiv Github
[IJCV17] End-to-end Learning of Deep Visual Representations for Image Retrieval arXiv Github
[ICCV17] Large-Scale Image Retrieval with Attentive Deep Local Features - Github
[ECCV16] CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples arXiv Github

3. Binary descriptor and quantization

For more compact representation, a binary descriptor can be generated from hashing or thresholding. Quantization is also very popular in large-scale image retrieval.
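As a small sketch of the thresholding route, the code below binarizes a real-valued descriptor by comparing each dimension against a threshold (zero by default, an assumption) and ranks a database by Hamming distance. It does not cover learned hashing or product quantization, and the names `binarize` and `hamming` as well as the toy vectors are hypothetical.

```python
def binarize(descriptor, thresholds=None):
    """Threshold a real-valued descriptor into bits packed in an int."""
    if thresholds is None:
        thresholds = [0.0] * len(descriptor)
    code = 0
    for value, t in zip(descriptor, thresholds):
        # Shift in one bit per dimension: 1 if above the threshold, else 0.
        code = (code << 1) | (1 if value > t else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two packed binary codes."""
    return bin(a ^ b).count("1")

# Toy data (assumed): one query and three database descriptors.
query = [0.3, -1.2, 0.8, -0.1]
database = [
    [0.5, -0.7, 0.9, -0.4],
    [-0.2, 0.6, 0.1, 0.3],
    [0.4, 0.2, 0.6, -0.9],
]
query_code = binarize(query)
codes = [binarize(d) for d in database]
distances = [hamming(query_code, c) for c in codes]
ranking = sorted(range(len(codes)), key=lambda i: distances[i])
```

Each descriptor is stored as a handful of bits, and comparison is an XOR followed by a bit count, which is why binary codes are attractive for large databases.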

Year Paper link Code
[ICCVW19] DAME WEB: DynAmic MEan with Whitening Ensemble Binarization for Landmark Retrieval without Human Annotation PDF Github
[CVPR19] FastAP: Deep Metric Learning to Rank PDF Github
[CVPR18] Hashing as Tie-Aware Learning to Rank PDF Github
[AAAI18] Deep Region Hashing for Generic Instance Search from Image - -
[TPAMI18] Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks - -
[TPAMI13] Iterative Quantization: A Procrustean Approach to Learning Binary Codes for Large-Scale Image Retrieval PDF -
[TPAMI10] Product quantization for nearest neighbor search PDF -

4. Post-processing

Anything that can boost performance in the post-processing stage, such as re-ranking or query expansion.
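For illustration, here is a minimal sketch of average query expansion, one common form of query expansion: the query is re-issued as the normalized mean of itself and its top-ranked neighbours. It is a generic sketch that assumes L2-normalized global descriptors compared by dot product, not the method of any specific paper listed here. All function names and the toy data are hypothetical.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rank(query, database):
    """Return database indices sorted by decreasing similarity to the query."""
    return sorted(range(len(database)), key=lambda i: -dot(query, database[i]))

def average_query_expansion(query, database, top_k=2):
    """Re-issue the query as the mean of itself and its top_k neighbours."""
    first_pass = rank(query, database)[:top_k]
    summed = list(query)
    for i in first_pass:
        summed = [s + x for s, x in zip(summed, database[i])]
    expanded = l2_normalize([s / (len(first_pass) + 1) for s in summed])
    return rank(expanded, database)

def unit(angle_deg):
    """Toy 2D unit descriptor pointing at the given angle (assumed data)."""
    a = math.radians(angle_deg)
    return [math.cos(a), math.sin(a)]

database = [unit(10), unit(20), unit(-25), unit(30)]
query = unit(0)
initial = rank(query, database)
expanded = average_query_expansion(query, database, top_k=2)
```

In this toy setup the expanded query drifts toward its two nearest neighbours, so the item at 30 degrees overtakes the one at -25 degrees in the second pass.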

Year Paper link Code
[CVPR19] Local features and visual words emerge in activations PDF -
[CVPR12] Object retrieval and localization with spatially-constrained similarity measure and k-NN re-ranking PDF -

5. 3D point cloud

Year Paper link Code
[CVPR18] PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition arXiv Github

Multi-tasking local and global descriptors

Some works cover both local description and global retrieval, since the two tasks share similar activations and applications.

Year Paper link Code
[arXiv20] UR2KiD: Unifying Retrieval, Keypoint Detection, and Keypoint Description without Local Correspondence Supervision arXiv -
[CVPR19] ContextDesc: Local Descriptor Augmentation with Cross-Modality Context - Github
[CVPR19] From Coarse to Fine: Robust Hierarchical Localization at Large Scale with HF-Net arXiv Github
[ICCV17] Large-Scale Image Retrieval with Attentive Deep Local Features (DELF) - Github

Review type paper

Year Paper link Code
[arXiv18] From handcrafted to deep local features arXiv -
[CVPR17] Comparative Evaluation of Hand-Crafted and Learned Local Features PDF -

Metric learning

Year Paper link Code
[arXiv20] Metric learning: cross-entropy vs. pairwise losses arXiv -
[arXiv19] A Metric Learning Reality Check arXiv -

MVS

Year Paper link Code
[CVPR20] Fast-MVSNet: Sparse-to-Dense Multi-View Stereo With Learned Propagation and Gauss-Newton Refinement arXiv github
[CVPR20] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks arXiv github

View Synthesis

Year Paper link Code
[arXiv20] Reference Pose Generation for Visual Localization via Learned Features and View Synthesis arXiv -
[CVPR20] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks arXiv github

Benchmarks

Local matching

Year Paper link Code Note
[github] pyslamv2 - github Covering all SOTA detectors and descriptors for SLAM
[arXiv20] Image Matching across Wide Baselines: From Paper to Practice arXiv github
[CVPR17] HPatches: A benchmark and evaluation of handcrafted and learned local descriptors arXiv Github Hpatches
[TPAMI11] Discriminative learning of local image descriptors Page - UBC/Brown dataset (subsets:Liberty (New York), Notre Dame (Paris) and Half Dome (Yosemite))
[CVPR08] On Benchmarking Camera Calibration and MultiView Stereo for High Resolution Imagery

Global retrieval

Year Paper link Code Note
[CVPR18] Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking Page Github ROxford5k, RParis6k
[CVPR07] Object retrieval with large vocabularies and fast spatial matching Page - Oxford5k
[CVPR08] Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases Page - Paris6k

Localization (both local matching and global retrieval)

Year Paper link Code Note
[ECCV20] Map-based Localization for Autonomous Driving web github1, github2 -
[CVPR18] Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions PDF,Page Github Aachen-day-night, Robotcar, CMU-seasons