Awesome Visual Localization

A curated list of awesome visual localization resources, inspired by awesome-computer-vision and awesome-visual-localization. Visual localization is the task of estimating the 6-DoF pose (3D position and 3D orientation) of an image given a representation of the world built from a set of reference images. The representation can be a 3D reconstruction, a set of pose-tagged images, or a deep neural network.

This document may contain errors or omissions. Feel free to make suggestions or open a pull request. All contributions are much appreciated.

Main Challenges
Benchmark
Challenges
Tutorial
Category
Localization Component
Localization System

Main Challenges

Illumination changes
Dynamic scenes with moving objects
Long time spans covering different seasons
Occlusion of the scene by an object or person
Strong viewpoint difference

Benchmark

LONG-TERM VISUAL LOCALIZATION

Challenges

[2022 ECCV] Map-Based Localization for Autonomous Driving
[2021 ICCV] Long-Term Visual Localization under Changing Conditions
[2021 ICCV] Map-Based Localization for Autonomous Driving
[2020 ECCV] Long-Term Visual Localization under Changing Conditions
[2020 ECCV] Map-Based Localization for Autonomous Driving
[2019 CVPR] Long-Term Visual Localization under Changing Conditions

Tutorial

ICCV 2021 Large-Scale Visual Localization

Approach	3D map	Pros	Cons
Structure-based	yes	Perform very well in most scenarios	Challenging in large environments in terms of processing time and memory consumption
Structure-based with image retrieval	yes	Improve speed and robustness for large-scale settings	Quality heavily relies on image retrieval
Scene point regression	yes/no	Very accurate position in small-scale settings	Accuracy still needs improvement in large environments
Absolute pose regression	no	Fast pose approximation, can be trained for certain challenges	Low accuracy
Pose interpolation	no	Fast and lightweight	Quality relies heavily on image retrieval and only provides a rough pose
Relative pose estimation	no	Fast and lightweight	Quality relies heavily on image retrieval and, e.g., local feature matches or a DNN used for relative pose estimation
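
To make the "Pose interpolation" row of the table above concrete, here is a minimal sketch of the idea: retrieve the most similar reference images and average their known camera positions, weighted by retrieval score. Everything in it is an illustrative assumption rather than the method of any listed paper: the function names `retrieve_top_k` and `interpolate_position`, the dictionary layout of the reference database, and the use of cosine similarity on toy global descriptors (a real system would typically use learned descriptors such as those from the Image Retrieval section). Only the position is interpolated; orientation is left out to keep the sketch short.

```python
import math


def retrieve_top_k(query_desc, database, k=3):
    """Rank reference images by cosine similarity of global descriptors.

    Assumes every descriptor is non-zero and has the same length.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    scored = [(cosine(query_desc, ref["descriptor"]), ref) for ref in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def interpolate_position(query_desc, database, k=3):
    """Estimate a rough camera position as a score-weighted mean
    of the positions of the top-k retrieved reference images."""
    top = retrieve_top_k(query_desc, database, k)
    # Negative similarities carry no useful weight, so clamp them to zero.
    weights = [max(score, 0.0) for score, _ in top]
    total = sum(weights)
    if total == 0.0:
        # No positively correlated reference: fall back to the best match.
        return tuple(top[0][1]["position"])
    return tuple(
        sum(w * ref["position"][axis] for w, (_, ref) in zip(weights, top)) / total
        for axis in range(3)
    )


# Hypothetical reference database: toy 2-D descriptors and known positions.
database = [
    {"descriptor": [1.0, 0.0], "position": (0.0, 0.0, 0.0)},
    {"descriptor": [0.0, 1.0], "position": (10.0, 0.0, 0.0)},
    {"descriptor": [-1.0, 0.0], "position": (20.0, 0.0, 0.0)},
]

estimate = interpolate_position([1.0, 1.0], database, k=2)
print(estimate)
```

With this toy data the query is equally similar to the first two references, so the estimate lands roughly midway between their positions. The sketch also shows why the table lists retrieval quality as the main weakness: if the wrong images are retrieved, the interpolated position is wrong no matter how the weights are chosen.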

Localization Component

Visual Feature

[2020 CVPR] ASLFeat: Learning Local Features of Accurate Shape and Localization [paper]
[2020 ECCV] Learning Feature Descriptors Using Camera Pose Supervision [paper]
[2019 NeurIPS] R2D2: Reliable and Repeatable Detector and Descriptor [paper]
[2019 CVPR] D2-Net: A Trainable CNN for Joint Description and Detection of Local Features [paper]
[2019 arXiv] From handcrafted to deep local features [paper]
[2018 CVPR] Semantic Visual Localization [paper]
[2018 CVPR] SuperPoint: Self-Supervised Interest Point Detection and Description [paper]
[2017 CVPR] Comparative Evaluation of Hand-Crafted and Learned Local Features [paper]
[2017 ICRA] Semantics-aware visual localization under challenging perceptual conditions [paper]
[2004 IJCV] Distinctive Image Features from Scale-Invariant Keypoints [paper]

Image Retrieval

[2022 arXiv] Investigating the Role of Image Retrieval for Visual Localization -- An exhaustive benchmark [paper]
[2019 ICCV] Learning With Average Precision: Training Image Retrieval With a Listwise Loss [paper]
[2019 TPAMI] Fine-Tuning CNN Image Retrieval with No Human Annotation [paper]
[2017 IJCV] End-to-End Learning of Deep Visual Representations for Image Retrieval [paper]
[2016 CVPR] NetVLAD: CNN Architecture for Weakly Supervised Place Recognition [paper]
[2015 CVPR] 24/7 Place Recognition by View Synthesis [paper]

Feature Match

[2022 arXiv] Is Geometry Enough for Matching in Visual Localization? [paper]
[2021 CVPR] LoFTR: Detector-Free Local Feature Matching with Transformers [paper] [code] [project]
[2020 ECCV] S2DNet: Learning Image Features for Accurate Sparse-to-Dense Matching [paper] [code]
[2020 CVPR] SuperGlue: Learning Feature Matching With Graph Neural Networks [paper]
[2019 3DV] Sparse-to-Dense Hypercolumn Matching for Long-Term Visual Localization [paper] [code]
[2018 ECCV] Semantic Match Consistency for Long-Term Visual Localization [paper]
[2017 TPAMI] Efficient & Effective Prioritized Matching for Large-Scale Image-Based Localization [paper]
[2017 ICCV] Efficient Global 2D-3D Matching for Camera Localization in a Large-Scale 3D Map [paper]
[2014 3DV] Matching Features Correctly through Semantic Understanding [paper]
[2008 TPAMI] Optimal Randomized RANSAC [paper]
[1981 CACM] Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography [paper]

Pose Computation

[2022 CVPR] The Probabilistic Normal Epipolar Constraint for Frame-To-Frame Rotation Optimization under Uncertain Feature Positions [paper]
[2020 ECCV] Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization [paper] [code]
[2011 CVPR] A novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation [paper]

Structure From Motion

[2016 CVPR] Structure-from-Motion Revisited [paper]
[2013 ICCV] Global Fusion of Relative Motions for Robust, Accurate and Scalable Structure from Motion [paper]

Localization System

Structure-based

[2021 CVPR] Back to the Feature: Learning Robust Camera Localization from Pixels to Pose [paper] [code]
[2019 CVPR] Visual Localization by Learning Objects-Of-Interest Dense Match Regression [paper]
[2018 CVPR] InLoc: Indoor Visual Localization with Dense Matching and View Synthesis [paper] [code]
[2011 ICCV] Fast Image-Based Localization using Direct 2D-to-3D Matching [paper]

Structure-based With Image Retrieval

[2022 arXiv] Robust Image Retrieval-based Visual Localization using Kapture [paper] [code]
[2020 ECCV Workshop] Hierarchical Localization with hloc and SuperGlue [slides] [code]
[2019 CVPR] From Coarse to Fine: Robust Hierarchical Localization at Large Scale [paper] [code]

Scene Point Regression

[2021 TPAMI] Visual Camera Re-Localization from RGB and RGB-D Images Using DSAC [paper] [code]
[2020 CVPR] Hierarchical Scene Coordinate Classification and Regression for Visual Localization [paper] [code]
[2019 ICCV] SANet: Scene Agnostic Network for Camera Localization [paper] [code]
[2019 ICCV] Expert Sample Consensus Applied to Camera Re-Localization [paper] [code]
[2018 CVPR] Learning Less is More – 6D Camera Localization via 3D Surface Regression [paper] [code]
[2017 CVPR] DSAC - Differentiable RANSAC for Camera Localization [paper] [code]
[2013 CVPR] Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images [paper]

Absolute Pose Regression

[2018 ICRA] Deep Auxiliary Learning for Visual Localization and Odometry [paper]
[2018 RA-L] VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry [paper]
[2018 CVPR] Geometry-Aware Learning of Maps for Camera Localization [paper] [code]
[2017 CVPR] Image-based localization using LSTMs for structured feature correlation [paper]
[2017 CVPR] Geometric loss functions for camera pose regression with deep learning [paper]
[2015 ICCV] PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization [paper]

Pose Interpolation

[2019 CVPR] Understanding the Limitations of CNN-based Absolute Camera Pose Regression [paper]
[2011 ICCV Workshop] Visual localization by linear combination of image descriptors [paper]

Relative Pose Estimation

[2020 ICRA] To Learn or Not to Learn: Visual Localization from Essential Matrices [paper]
[2019 ICCV] CamNet: Coarse-to-Fine Retrieval for Camera Re-Localization [paper]
[2018 ECCV] RelocNet: Continuous Metric Learning Relocalisation using Neural Nets [paper]
[2017 ICCV Workshop] Camera Relocalization by Computing Pairwise Relative Poses Using Convolutional Neural Network [paper] [code]
[2006 3DPVT] Image Based Localization in Urban Environments [paper]

youkely/awesome-visual-localization