A curated list of papers & resources linked to 3D reconstruction from images.
Note that:
- This list is not exhaustive,
- Tables use alphabetical order for fairness.
If you are looking for a more generic computer vision awesome list, please check this list.
Micro Flying Robots: from Active Vision to Event-based Vision. D. Scaramuzza.
ICRA 2016 Aerial Robotics - (Visual odometry). D. Scaramuzza.
Simultaneous Localization And Mapping: Present, Future, and the Robust-Perception Age. C. Cadena, L. Carlone, H. Carrillo, Y. Latif, D. Scaramuzza, J. Neira, I. D. Reid, J. J. Leonard.
- "The paper summarizes the outcome of the workshop “The Problem of Mobile Sensors: Setting future goals and indicators of progress for SLAM” held during the Robotics: Science and System (RSS) conference (Rome, July 2015)."
Visual Odometry: Part I - The First 30 Years and Fundamentals, D. Scaramuzza and F. Fraundorfer, IEEE Robotics and Automation Magazine, Volume 18, issue 4, 2011
Visual Odometry: Part II - Matching, robustness, optimization, and applications, F. Fraundorfer and D. Scaramuzza, IEEE Robotics and Automation Magazine, Volume 19, issue 2, 2012
Large-scale, real-time visual-inertial localization revisited. S. Lynen, B. Zeisl, D. Aiger, M. Bosse, J. Hesch, M. Pollefeys, R. Siegwart and T. Sattler. arXiv 2019.
Open Source Structure-from-Motion. M. Leotta, S. Agarwal, F. Dellaert, P. Moulon, V. Rabaud. CVPR 2015 Tutorial.
Large-scale 3D Reconstruction from Images. T. Shen, J. Wang, T. Fang, L. Quan. ACCV 2016 Tutorial.
Multi-View Stereo: A Tutorial. Y. Furukawa, C. Hernández. Foundations and Trends® in Computer Graphics and Vision, 2015.
State of the Art 3D Reconstruction Techniques. N. Snavely, Y. Furukawa. CVPR 2014 tutorial slides: Introduction - MVS with priors - Large scale MVS.
3D indoor scene modeling from RGB-D data: a survey. K. Chen, YK. Lai and SM. Hu. Computational Visual Media 2015.
State of the Art on 3D Reconstruction with RGB-D Cameras K. Hildebrandt and C. Theobalt EUROGRAPHICS 2018.
Introduction of Visual SLAM, Structure from Motion and Multiple View Stereo. Yu Huang 2014.
Computer Vision: Algorithms and Applications. R. Szeliski. 2010.
Real-time simultaneous localisation and mapping with a single camera. A. J. Davison. ICCV 2003.
Visual odometry. D. Nister, O. Naroditsky, and J. Bergen. CVPR 2004.
Real time localization and 3d reconstruction. E. Mouragnon, M. Lhuillier, M. Dhome, F. Dekeyser, and P. Sayd. CVPR 2006.
Parallel Tracking and Mapping for Small AR Workspaces. G. Klein, D. Murray. ISMAR 2007.
Real-Time 6-DOF Monocular Visual SLAM in a Large-scale Environment. H. Lim, J. Lim, H. Jin Kim. ICRA 2014.
Direct Sparse Odometry, J. Engel, V. Koltun, D. Cremers, arXiv:1607.02565, 2016.
Visual SLAM algorithms: a survey from 2010 to 2016, T. Taketomi, H. Uchiyama, S. Ikeda, IPSJ T Comput Vis Appl 2017.
∇SLAM: Dense SLAM meets Automatic Differentiation. K. M. Jatavallabhula, G. Iyer, L. Paull. arXiv:1910.10672, 2019.
Direct Sparse Mapping. J. Zubizarreta, I. Aguinaga and J. M. M. Montiel. arXiv:1904.06577, 2019.
OpenVSLAM: A Versatile Visual SLAM Framework. S. Sumikura, M. Shibuya, K. Sakurada. Proceedings of the 27th ACM International Conference on Multimedia, 2019.
Photo Tourism: Exploring Photo Collections in 3D. N. Snavely, S. M. Seitz, and R. Szeliski. SIGGRAPH 2006.
Towards linear-time incremental structure from motion. C. Wu. 3DV 2013.
Structure-from-Motion Revisited. J. L. Schönberger, J.-M. Frahm. CVPR 2016.
Combining two-view constraints for motion estimation. V. M. Govindu. CVPR, 2001.
Lie-algebraic averaging for globally consistent motion estimation. V. M. Govindu. CVPR, 2004.
Robust rotation and translation estimation in multiview reconstruction. D. Martinec and T. Pajdla. CVPR, 2007.
Non-sequential structure from motion. O. Enqvist, F. Kahl, and C. Olsson. ICCV OMNIVIS Workshops 2011.
Global motion estimation from point matches. M. Arie-Nachimson, S. Z. Kovalsky, I. Kemelmacher-Shlizerman, A. Singer, and R. Basri. 3DIMPVT 2012.
Global Fusion of Relative Motions for Robust, Accurate and Scalable Structure from Motion. P. Moulon, P. Monasse and R. Marlet. ICCV 2013.
A Global Linear Method for Camera Pose Registration. N. Jiang, Z. Cui, P. Tan. ICCV 2013.
Global Structure-from-Motion by Similarity Averaging. Z. Cui, P. Tan. ICCV 2015.
Linear Global Translation Estimation from Feature Tracks. Z. Cui, N. Jiang, C. Tang, P. Tan. BMVC 2015.
Structure-and-Motion Pipeline on a Hierarchical Cluster Tree. M. Farenzena, A. Fusiello, R. Gherardi. Workshop on 3-D Digital Imaging and Modeling, 2009.
Randomized Structure from Motion Based on Atomic 3D Models from Camera Triplets. M. Havlena, A. Torii, J. Knopp, and T. Pajdla. CVPR 2009.
Efficient Structure from Motion by Graph Optimization. M. Havlena, A. Torii, and T. Pajdla. ECCV 2010.
Hierarchical structure-and-motion recovery from uncalibrated images. R. Toldo, R. Gherardi, M. Farenzena and A. Fusiello. CVIU 2015.
Parallel Structure from Motion from Local Increment to Global Averaging. S. Zhu, T. Shen, L. Zhou, R. Zhang, J. Wang, T. Fang, L. Quan. arXiv 2017.
Multistage SFM: Revisiting Incremental Structure from Motion. R. Shah, A. Deshpande, P. J. Narayanan. 3DV 2014. -> Multistage SFM: A Coarse-to-Fine Approach for 3D Reconstruction, arXiv 2016.
HSfM: Hybrid Structure-from-Motion. H. Cui, X. Gao, S. Shen and Z. Hu, ICCV 2017.
Robust Structure from Motion in the Presence of Outliers and Missing Data. G. Wang, J. S. Zelek, J. Wu, R. Bajcsy. 2016.
Skeletal graphs for efficient structure from motion. N. Snavely, S. Seitz, R. Szeliski. CVPR 2008
Optimizing the Viewing Graph for Structure-from-Motion. C. Sweeney, T. Sattler, M. Turk, T. Hollerer, M. Pollefeys. ICCV 2015
Graph-Based Consistent Matching for Structure-from-Motion. T. Shen, S. Zhu, T. Fang, R. Zhang, L. Quan. ECCV 2016.
Unordered feature tracking made fast and easy. P. Moulon and P. Monasse. CVMP 2012.
Point Track Creation in Unordered Image Collections Using Gomory-Hu Trees. Svärm, Simayijiang, Enqvist, Olsson. ICPR 2012.
Fast connected components computation in large graphs by vertex pruning. A. Lulli, E. Carlini, P. Dazzi, C. Lucchese, and L. Ricci. IEEE Transactions on Parallel and Distributed Systems 2016.
Video Google: A Text Retrieval Approach to Object Matching in Video. J. Sivic, F. Schaffalitzky and A. Zisserman. ICCV 2003.
Scalable Recognition with a Vocabulary Tree. Nister, Stewenius, CVPR 2006.
Building Rome in a Day. S. Agarwal, N. Snavely, I. Simon, S. M. Seitz, R. Szeliski. ICCV 2009.
Product quantization for nearest neighbor search. H. Jégou, M. Douze and C. Schmid. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011.
Fast and Accurate Image Matching with Cascade Hashing for 3D Reconstruction. J. Cheng, C. Leng, J. Wu, H. Cui, H. Lu. CVPR 2014.
Recent developments in large-scale tie-point matching. Hartmann, Havlena, Schindler. ISPRS 2016.
Graphmatch: Efficient Large-Scale Graph Construction for Structure from Motion. C. Qiaodong, V. Fragoso, C. Sweeney and P. Sen. 3DV 2017.
Real-time Image-based 6-DOF Localization in Large-Scale Environments. Lim, Sinha, Cohen, Uyttendaele. CVPR 2012.
Get Out of My Lab: Large-scale, Real-Time Visual-Inertial Localization. Lynen, Sattler, Bosse, Hesch, Pollefeys, Siegwart. RSS 2015.
DSAC - Differentiable RANSAC for Camera Localization. E. Brachmann, A. Krull, S. Nowozin, J. Shotton, F. Michel, S. Gumhold, C. Rother. CVPR 2017.
Learning Less is More - 6D Camera Localization via 3D Surface Regression. E. Brachmann, C. Rother. CVPR 2018.
Accurate, Dense, and Robust Multiview Stereopsis. Y. Furukawa, J. Ponce. CVPR 2007. PAMI 2010
State of the art in high density image matching. F. Remondino, M.G. Spera, E. Nocerino, F. Menna, F. Nex. The Photogrammetric Record 29(146), 2014.
Progressive prioritized multi-view stereo. A. Locher, M. Perdoch and L. Van Gool. CVPR 2016.
Pixelwise View Selection for Unstructured Multi-View Stereo. J. L. Schönberger, E. Zheng, M. Pollefeys, J.-M. Frahm. ECCV 2016.
TAPA-MVS: Textureless-Aware PAtchMatch Multi-View Stereo. A. Romanoni, M. Matteucci. ICCV 2019
Efficient Multi-View Reconstruction of Large-Scale Scenes using Interest Points, Delaunay Triangulation and Graph Cuts. P. Labatut, J-P. Pons, R. Keriven. ICCV 2007
Multi-View Stereo via Graph Cuts on the Dual of an Adaptive Tetrahedral Mesh. S. N. Sinha, P. Mordohai and M. Pollefeys. ICCV 2007.
Towards high-resolution large-scale multi-view stereo. H.-H. Vu, P. Labatut, J.-P. Pons, R. Keriven. CVPR 2009.
Refinement of Surface Mesh for Accurate Multi-View Reconstruction. R. Tylecek and R. Sara. IJVR 2010.
High Accuracy and Visibility-Consistent Dense Multiview Stereo. H.-H. Vu, P. Labatut, J.-P. Pons, R. Keriven. PAMI 2012.
Exploiting Visibility Information in Surface Reconstruction to Preserve Weakly Supported Surfaces. M. Jancosek et al. 2014.
A New Variational Framework for Multiview Surface Reconstruction. B. Semerjian. ECCV 2014.
Photometric Bundle Adjustment for Dense Multi-View 3D Modeling. A. Delaunoy, M. Pollefeys. CVPR 2014.
Global, Dense Multiscale Reconstruction for a Billion Points. B. Ummenhofer, T. Brox. ICCV 2015.
Efficient Multi-view Surface Refinement with Adaptive Resolution Control. S. Li, S. Yu Siu, T. Fang, L. Quan. ECCV 2016.
Multi-View Inverse Rendering under Arbitrary Illumination and Albedo, K. Kim, A. Torii, M. Okutomi, ECCV 2016.
Shading-aware Multi-view Stereo, F. Langguth and K. Sunkavalli and S. Hadap and M. Goesele, ECCV 2016.
Scalable Surface Reconstruction from Point Clouds with Extreme Scale and Density Diversity, C. Mostegel, R. Prettenthaler, F. Fraundorfer and H. Bischof. CVPR 2017.
Multi-View Stereo with Single-View Semantic Mesh Refinement, A. Romanoni, M. Ciccone, F. Visin, M. Matteucci. ICCVW 2017
Matchnet: Unifying feature and metric learning for patch-based matching, X. Han, Thomas Leung, Y. Jia, R. Sukthankar, A. C. Berg. CVPR 2015.
Stereo matching by training a convolutional neural network to compare image patches, J. Zbontar and Y. LeCun. JMLR 2016.
Efficient deep learning for stereo matching, W. Luo, A. G. Schwing, R. Urtasun. CVPR 2016.
Learning a multi-view stereo machine, A. Kar, C. Häne, J. Malik. NIPS 2017.
Learned multi-patch similarity, W. Hartmann, S. Galliani, M. Havlena, L. V. Gool, K. Schindler. ICCV 2017.
Surfacenet: An end-to-end 3d neural network for multiview stereopsis, M. Ji, J. Gall, H. Zheng, Y. Liu, L. Fang. ICCV 2017.
DeepMVS: Learning Multi-View Stereopsis, Huang, P. and Matzen, K. and Kopf, J. and Ahuja, N. and Huang, J. CVPR 2018.
RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials, D. Paschalidou and A. O. Ulusoy and C. Schmitt and L. Gool and A. Geiger. CVPR 2018.
MVSNet: Depth Inference for Unstructured Multi-view Stereo, Y. Yao, Z. Luo, S. Li, T. Fang, L. Quan. ECCV 2018.
Learning Unsupervised Multi-View Stereopsis via Robust Photometric Consistency, T. Khot, S. Agrawal, S. Tulsiani, C. Mertz, S. Lucey, M. Hebert. 2019.
DPSNet: End-to-end Deep Plane Sweep Stereo, Sunghoon Im, Hae-Gon Jeon, Stephen Lin, In So Kweon. 2019.
Point-based Multi-view Stereo Network, Rui Chen, Songfang Han, Jing Xu, Hao Su. ICCV 2019.
Seamless image-based texture atlases using multi-band blending. C. Allène, J-P. Pons and R. Keriven. ICPR 2008.
Let There Be Color! - Large-Scale Texturing of 3D Reconstructions. M. Waechter, N. Moehrle, M. Goesele. ECCV 2014.
Submodular Trajectory Optimization for Aerial 3D Scanning. M. Roberts, A. Truong, D. Dey, S. Sinha, A. Kapoor, N. Joshi, P. Hanrahan. 2017.
Project | Language | License |
---|---|---|
Bundler | C++ | GNU General Public License - contamination |
Colmap | C++ | BSD 3-clause license - Permissive |
MAP-Tk | C++ | BSD 3-Clause license - Permissive |
MicMac | C++ | CeCILL-B |
MVE | C++ | BSD 3-Clause license + parts under the GPL 3 license |
OpenMVG | C++ | MPL2 - Permissive |
OpenSfM | Python | Simplified BSD license - Permissive |
TheiaSfM | C++ | New BSD license - Permissive |
Project | Language | License |
---|---|---|
OpenGV | C++ | BSD - permissive |
Project | Language | License |
---|---|---|
Colmap | C++ CUDA | BSD 3-clause license - Permissive (Can use CGAL -> GNU General Public License - contamination) |
Gipuma + fusibile | C++ CUDA | GNU General Public License - contamination |
HPMVS | C++ | GNU General Public License - contamination |
MICMAC | C++ | CeCILL-B |
MVE | C++ | BSD 3-Clause license + parts under the GPL 3 license |
OpenMVS | C++ (CUDA optional) | AGPL3 |
PMVS | C++ CUDA | GNU General Public License - contamination |
SMVS Shading-aware Multi-view Stereo | C++ | BSD-3-Clause license |
Project | Language | License |
---|---|---|
COSLAM | C++ | GNU General Public License |
DSO-Direct Sparse Odometry | C++ | GPLv3 |
DTSLAM-Deferred Triangulation SLAM | C++ | modified BSD |
LSD-SLAM | C++/ROS | GNU General Public License |
MAPLAB-ROVIOLI | C++/ROS | Apache v2.0 |
OKVIS: Open Keyframe-based Visual-Inertial SLAM | C++ | BSD |
ORB-SLAM | C++ | GPLv3 |
REBVO - Realtime Edge Based Visual Odometry for a Monocular Camera | C++ | GNU General Public License |
SVO semi-direct Visual Odometry | C++/ROS | GNU General Public License |
Project | Language | License |
---|---|---|
DBoW2 | C++ | modified BSD License |
libvot | C++ | BSD 3-Clause License |
VocabTree2 | C++ | BSD License |
Project | Language | License |
---|---|---|
CERES SOLVER | C++ | BSD License |
GTSAM | C++ | BSD License |
G2O | C++ | BSD License + L/GPL3 restriction |
NLOPT | C++ | LGPL |
Project | Language | License |
---|---|---|
ANN | C++ | GNU General Public License |
Annoy | C++ | Apache License |
FLANN | C++ | BSD License |
Libnabo | C++ | BSD License |
Nanoflann | C++ | BSD License |
Project | Language | License |
---|---|---|
3DTK | C++ | GPLv3 |
CGAL | C++ | Module dependent GPL/LGPL |
InstantMesh Mesh Simplification | C++ | BSD License |
GEOGRAM | C++ | Revised BSD License |
libigl | C++ | MPL2 |
Mesh-processing-library | C++ | MIT License |
Open3D | C++ | MIT License |
OpenMesh | C++ | BSD 3 clause license |
PCL | C++ | 3-clause BSD license |
VCG | C++ | GPL |
From handcrafted to deep local features. G. Csurka, C. R. Dance, M. Humenberger. 2018.
Project | Detection | Description |
---|---|---|
AKAZE | x | MSURF/MLDB |
DART | x | x |
KAZE | x | MSURF/MLDB |
LIOP/MIOP | x | |
LIFT (machine learning) | x | x |
MROGH | x | |
SIFT | x | x |
SURF | x | x |
SFOP | x | |
... |
Project | Detection | Description |
---|---|---|
BRIEF | x | |
BRISK | x | x |
FAST | x | |
FREAK | x | |
FRIF | x | x |
HIPS | x | |
LATCH | x | |
MOPS | x | |
PhonySift | Multi-scale Fast | Reduced Sift grid |
ORB | Multiscale Fast | Oriented BRIEF |
VGG Oxford: 8 datasets with GT homographies + MATLAB code.
Hannover - Region Detector Evaluation Data Set: similar to the previous one (5 datasets). The datasets come in multiple image resolutions, with more precise GT homographies.
DTU - Robot Image Data Sets - Point Feature Data Set: 60 scenes with known calibration & different illuminations.
Corresponding patches, saved with a canonical scale and orientation.
Multi-view Stereo Correspondence Dataset
HPatches Dataset linked to the ECCV16 workshop "Local Features: State of the art, open problems and performance evaluation"
Mono dataset 50 real-world sequences. Dataset linked to the DSO Visual Odometry paper.
Middlebury Multi-view Stereo See "A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms". CVPR 2006.
Dense MVS See "On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery". CVPR 2008.
DTU - Robot Image Data Sets - MVS Data Set. See "Large Scale Multi-view Stereopsis Evaluation". CVPR 2014.
A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos in Unstructured Scenes, T. Schöps, J. L. Schönberger, S. Galliani, T. Sattler, K. Schindler, M. Pollefeys, A. Geiger. CVPR 2017.
Tanks and Temples: Benchmarking Large-Scale Scene Reconstruction, A. Knapitsch, J. Park, Q.Y. Zhou and V. Koltun. SIGGRAPH 2017.
To the extent possible under law, Pierre Moulon has waived all copyright and related or neighboring rights to this work.
Please see CONTRIBUTING for details.