Holistic 3D Reconstruction

A list of papers and resources for holistic 3D reconstruction.

Datasets

Scene level

Datasets	#Scenes	#Rooms	#Frames	Annotated Structures
PlaneRCNN	~1,500	~1,500	100,000 (randomly sampled from 1 million)	planes
Replica	18	n/a	-	planes
Wireframe	-	-	5,462	wireframe (2D)
Wireframe Reconstruction	synthetic and real images	-	-	wireframe (3D)
SUN Primitive	-	-	785	cuboid, pyramid, cylinder, sphere, etc.
LSUN Room Layout	-	n/a	5,394	cuboid layout
PanoContext	-	n/a	500 (pano)	cuboid layout
LayoutNet	-	n/a	1,071 (pano)	cuboid layout
MatterportLayout	-	n/a	2,295 (RGB-D pano)	Manhattan layout
Floor-SP	100	707	~1,500 (every scene has a set of RGB-D pano)	floorplan (with non-Manhattan structures)
FloorNet	~150	~1,000	-	floorplan
Raster-to-Vector	870	-	-	floorplan
3RScan	478	-	-	objects
Structured3D	3,500	21,835	196,515	primitives (points/lines/planes) and relationships, 3D object instance bounding boxes

Object level

Datasets	#Images	#Categories	#3D models	Annotated Structures	Notes
Keypoint-5	8,649	5	-	keypoints
IKEA Keypoints	759		219	keypoints	derived from IKEA 3D
ANSI Mechanical Component	-	504	17,197	plane, sphere, cylinder, cone, etc.
PartNet	-	24	26,671	fine-grained, instance-level, and hierarchical 3D parts	derived from ShapeNet
PartNet-Symh	-	24	22,369	Symmetry hierarchical 3D parts	derived from PartNet
StructureNet	-	6	-	Symmetry hierarchical 3D parts	derived from PartNet

Dataset examples

PlaneRCNN

From left to right: input RGB image, planar segmentation, depthmap

Wireframe

First row: manually labelled line segments. Second row: groundtruth junctions

Wireframe Reconstruction

From left column to right column: input image with groundtruth wireframes, predicted 3D wireframe and alternative view of the same image

SUN Primitive

Yellow: groundtruth, green: correct detection, red: false alarm

LSUN Room Layout

From left to right: input RGB image, room layout (corner representation), room layout (segmentation representation)

PanoContext

From left to right: a single-view panorama, object detection and 3D reconstruction

LayoutNet

Orange lines: predicted layout, Green lines: groundtruth layout

Raster-to-Vector

From left to right: an input floorplan image, reconstructed vector-graphics representation visualized by custom renderer, and a popup 3D model

Floor-SP

From left to right: stitched RGB-D panorama of indoor scenes & top-view point density/normal map, vector-graphics floorplan with non-Manhattan structures

Structured3D

(a) house designs (b) ground truth 3D structure annotations (c) photo-realistic 2D images

Keypoint-5 and IKEA Keypoints

Left: input image, right: labeled 2D keypoints

ANSI Mechanical Component

Top to bottom: input point cloud and geometric primitives

PartNet

From left column to right column: three levels (from coarse to fine-grained) of segmentation annotations in the hierarchy, for three segmentation tasks

PartNet-Symh

Odd rows: groundtruth fine-grained segmentation results, even rows: predicted fine-grained segmentation results

References

Books

Y. Ma, S. Soatto, J. Kosecka, and S. S. Sastry. An Invitation to 3D Vision: From Images to Geometric Models. Springer Verlag, 2003.
R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2000.

Papers - Scene level

2020

Y. Nie, X. Han, S. Guo, Y. Zheng, J. Chang, and J. J. Zhang. Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes From a Single Image. In CVPR, 2020. [project]
Z. Jiang, B. Liu, S. Schulter, Z. Wang, and M. Chandraker. Peek-a-Boo: Occlusion Reasoning in Indoor Scenes With Plane Representations. In CVPR, 2020. [paper]
H. Zeng, K. Joseph, A. Vest, and Y. Furukawa. Bundle Pooling for Polygonal Architecture Segmentation Problem. In CVPR, 2020. [paper]
F. Zhang, N. Nauata, and Y. Furukawa. Conv-MPN: Convolutional Message Passing Neural Network for Structured Outdoor Architecture Reconstruction. In CVPR, 2020. [project]
J. Zheng*, J. Zhang*, J. Li, R. Tang, S. Gao, and Z. Zhou. Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling. In ECCV, 2020. [project]
Y. Qian and Y. Furukawa. Learning Inter-Plane Relations for Piecewise Planar Reconstruction. In ECCV, 2020.
S. Qian, L. Jin, and D. F. Fouhey. Associative3D: Volumetric Reconstruction from Sparse Views. In ECCV, 2020. [project]
G. Pintore, M. Agus, and E. Gobbetti. AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption. In ECCV, 2020. [project]
A. Avetisyan et al. SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans. In ECCV, 2020. [paper]
Y. Lin, S. L. Pintea, and J. C. van Gemert. Deep Hough-Transform Line Priors. In ECCV, 2020. [project]

2019

Y. Zhou, H. Qi, and Y. Ma. NeurVPS: Neural Vanishing Point Scanning via Conic Convolution. In NeurIPS, 2019. [project]
Y. Zhou, H. Qi, and Y. Ma. End-to-End Wireframe Parsing. In ICCV, 2019. [project]
Y. Zhou, H. Qi, Y. Zhai, Q. Sun, Z. Chen, L. Wei, and Y. Ma. Learning to Reconstruct 3D Manhattan Wireframes from a Single Image. In ICCV, 2019. [project]
J. Chen, C. Liu, J. Wu, and Y. Furukawa. Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path. In ICCV, 2019. [project]
J. Wald, A. Avetisyan, N. Navab, F. Tombari, and M. Niessner. RIO: 3D Object Instance Re-Localization in Changing Indoor Environments. In ICCV, 2019. [project]
C. Zou*, J.-W. Su*, C.-H. Peng, A. Colburn, Q. Shan, P. Wonka, H.-K. Chu, and D. Hoiem. 3D Manhattan Room Layout Reconstruction from a Single 360 Image. arXiv:1910.04099, 2019. [project]
C. Liu, K. Kim, J. Gu, Y. Furukawa, and J. Kautz. PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image. In CVPR, 2019. [project]
Z. Yu*, J. Zheng*, D. Lian, Z. Zhou, and S. Gao. Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding. In CVPR, 2019. [project]
Z. Zhang*, Z. Li*, N. Bi, J. Zheng, J. Wang, K. Huang, W. Luo, Y. Xu, and S. Gao. PPGNet: Learning Point-Pair Graph for Line Segment Detection. In CVPR, 2019. [project]

2018

F. Yang and Z. Zhou. Recovering 3D planes from a single image via convolutional neural networks. In ECCV, 2018. [project]
H. Zeng, J. Wu, and Y. Furukawa. Neural Procedural Reconstruction for Residential Buildings. In ECCV, 2018. [paper]
C. Liu*, J. Yu*, and Y. Furukawa. FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans. In ECCV, 2018. [project]
C. Zou, A. Colburn, Q. Shan, and D. Hoiem. LayoutNet: Reconstructing the 3D room layout from a single RGB image. In CVPR, 2018. [project]
C. Liu, J. Yang, D. Ceylan, E. Yumer, and Y. Furukawa. PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image. In CVPR, 2018. [project]
K. Huang, Y. Wang, Z. Zhou, T. Ding, S. Gao, and Y. Ma. Learning to parse wireframes in images of man-made environments. In CVPR, 2018. [project]
H. Fang, F. Lafarge and M. Desbrun. Planar Shape Detection at Structural Scales. In CVPR, 2018. [paper]

2017

L. Nan and P. Wonka. PolyFit: Polygonal Surface Reconstruction from Point Clouds. In ICCV, 2017. [project]
C. Liu, J. Wu, P. Kohli, and Y. Furukawa. Raster-to-Vector: Revisiting Floorplan Transformation. In ICCV, 2017. [project]
C. Lee, V. Badrinarayanan, T. Malisiewicz, and A. Rabinovich. RoomNet: End-to-end room layout estimation. In ICCV, 2017. [paper]
H. Izadinia, Q. Shan, and S. M. Seitz. IM2CAD. In CVPR, 2017. [project]
E. Wijmans and Y. Furukawa. Exploiting 2D Floorplan for Building-scale Panorama RGBD Alignment. In CVPR, 2017. [project]
T. Kelly, J. Femiani, P. Wonka, and N. Mitra. BigSUR: Large-scale Structured Urban Reconstruction. In SIGGRAPH Asia, 2017. [paper]

2016

M. Li, P. Wonka, and L. Nan. Manhattan-world Urban Reconstruction from Point Clouds. In ECCV, 2016. [paper]
C. Zhu, Z. Zhou, Z. Xing, Y. Dong, Y. Ma, and J. Yu. Robust Plane-based Calibration of Multiple Non-Overlapping Cameras. In 3DV, 2016. [paper]
C. Liu, P. Kohli, and Y. Furukawa. Layered Scene Decomposition via the Occlusion-CRF. In CVPR, 2016. [project]
S. Dasgupta, K. Fang, K. Chen, and S. Savarese. DeLay: Robust spatial layout estimation for cluttered indoor scenes. In CVPR, 2016. [paper]
S. Oesau, F. Lafarge and P. Alliez. Planar Shape Detection and Regularization in Tandem. Computer Graphics Forum, 2016. [paper]

2015

S. Ikehata, H. Yan, and Y. Furukawa. Structured Indoor Modeling. In ICCV, 2015. [paper]
O. Haines and A. Calway. Recognising planes in a single image. IEEE TPAMI, 2015. [paper]
A. Monszpart, N. Mellado, G. J. Brostow, and N. J. Mitra. RAPTER: Rebuilding Man-made Scenes with Regular Arrangements of Planes. In SIGGRAPH, 2015. [paper]
J. Favreau, F. Lafarge, and A. Bousseau. Line Drawing Interpretation in a Multi-View Context. In CVPR, 2015.

2014

D. F. Fouhey, A. Gupta, and M. Hebert. Unfolding an indoor origami world. In ECCV, 2014. [paper]
R. Cabral and Y. Furukawa. Piecewise Planar and Compact Floorplan Reconstruction from Images. In CVPR, 2014. [paper]
D. Ceylan, N. J. Mitra, Y. Zheng, and M. Pauly. Coupled Structure-from-Motion and 3D Symmetry Detection for Urban Facades. ACM Transactions on Graphics, 2014. [paper]

2013

S. Ramalingam and M. Brand. Lifting 3D Manhattan lines from a single image. In ICCV, 2013. [paper]
S. Ramalingam, J. K. Pillai, A. Jain, and Y. Taguchi. Manhattan junction catalogue for spatial reasoning of indoor scenes. In CVPR, 2013. [paper]
Z. Zhou, H. Jin, and Y. Ma. Plane-Based Content-Preserving Warps for Video Stabilization. In CVPR, 2013. [paper]
N. J. Mitra, M. Pauly, M. Wand, and D. Ceylan. Symmetry in 3D Geometry: Extraction and Applications. Computer Graphics Forum, 2013. [paper]

2012

J. Xiao, B. C. Russell, and A. Torralba. Localizing 3D cuboids in single-view images. In NIPS, 2012. [paper]
J. Xiao and Y. Furukawa. Reconstructing the World's Museums. In ECCV, 2012. [paper]
Z. Zhou, H. Jin, and Y. Ma. Robust Plane-Based Structure From Motion. In CVPR, 2012. [paper]
A. Cohen, C. Zach, S. N. Sinha and M. Pollefeys. Discovering and exploiting 3D symmetries in structure from motion. In CVPR, 2012. [paper]
C. A. Vanegas, D. G. Aliaga, and B. Benes. Automatic Extraction of Manhattan-World Building Masses from 3D Laser Range Scans. IEEE TVCG, 2012. [paper]

2011

H. Mobahi, Z. Zhou, A. Y. Yang, and Y. Ma. Holistic Reconstruction of Urban Structures from Low-rank Textures. In ICCV-3dRR, 2011. [paper]
Z. Zhang, X. Liang, and Y. Ma. Unwrapping Low-rank Textures on Generalized Cylindrical Surfaces. In ICCV, 2011. [paper]
A. Flint, D. W. Murray, and I. Reid. Manhattan scene understanding using monocular, stereo, and 3D features. In ICCV, 2011. [paper]
C. Wu, J.-M. Frahm, and M. Pollefeys. Repetition-based dense single-view reconstruction. In CVPR, 2011. [paper]
A. Elqursh and A. M. Elgammal. Line-based relative pose estimation. In CVPR, 2011. [paper]
Z. Zhang, Y. Matsushita, and Y. Ma. Camera Calibration with Lens Distortion from Low-rank Textures. In CVPR, 2011. [paper]

2010 and before

D. Gallup, J.-M. Frahm, and M. Pollefeys. Piecewise Planar and Non-Planar Stereo for Urban Scene Reconstruction. In CVPR, 2010. [paper]
Y. Furukawa, B. Curless, S. M. Seitz and R. Szeliski. Reconstructing Building Interiors from Images. In ICCV, 2009. [paper]
V. Hedau, D. Hoiem, and D. A. Forsyth. Recovering the spatial layout of cluttered rooms. In ICCV, 2009. [paper]
Y. Furukawa, B. Curless, S. M. Seitz, and R. Szeliski. Manhattan-world stereo. In CVPR, 2009. [paper]
D.C. Lee, M. Hebert, and T. Kanade. Geometric Reasoning for Single Image Structure Recovery. In CVPR, 2009. [paper]
G. Schindler, P. Krishnamurthy, R. Lublinerman, Y. Liu, and F. Dellaert. Detecting and Matching Repeated Patterns for Automatic Geo-tagging in Urban Environments. In CVPR, 2008. [paper]
B. Micusik, H. Wildenauer, and J. Kosecka. Detection and matching of rectilinear structures. In CVPR, 2008. [paper]
D. Hoiem, A. A. Efros, and M. Hebert. Recovering surface layout from an image. IJCV, 2007. [paper]
G. Schindler, P. Krishnamurthy, and F. Dellaert. Line-Based Structure From Motion for Urban Environments. In 3DPVT, 2006. [paper]
J. M. Coughlan and A. L. Yuille. Manhattan world: Orientation and outlier detection by Bayesian inference. Neural Computation, 2003. [paper]
A. Bartoli and P. Sturm. Constrained structure and motion from multiple uncalibrated views of a piecewise planar scene. IJCV, 2003. [paper]
J. Kosecka and W. Zhang. Video Compass. In ECCV, 2002. [paper]
A. P. Witkin and J. M. Tenenbaum. On the role of structure in vision. In J. Beck, B. Hope, and A. Rosenfeld, editors, Human and Machine Vision, pages 481–543. Academic Press, 1983. [paper]

Papers - Object level