Awesome Scene Understanding

A curated list of awesome scene understanding papers, inspired by awesome-computer-vision.

📷 Multi-view images
🎲 Point cloud

Related Resources

Workshops and Tutorials

Survey

Papers	Venue	Links
Advances in Data-Driven Analysis and Synthesis of 3D Indoor Scenes	CGF 2023	-
State-of-the-art in Automatic 3D Reconstruction of Structured Indoor Environments	CGF 2020	[project]
Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey	IEEE Access 2019	-
RGBD Datasets: Past, Present and Future	CVPR Workshop 2016	[project]

Dataset

Realistic Dataset

Papers	Venue	Links
ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes	ICCV 2023	[project]
ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data	NeurIPS 2021 Dataset Track	[code]
Zillow Indoor Dataset: Annotated Floor Plans With 360˚ Panoramas and 3D Room Layouts	CVPR 2021	[code]
HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures	CoRR 2020	[project]
OASIS: A Large-Scale Dataset for Single Image 3D in the Wild	CVPR 2020	[project]
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera	ICCV 2019	[project]
The Replica Dataset: A Digital Replica of Indoor Spaces	CoRR 2019	[code]
Matterport3D: Learning from RGB-D Data in Indoor Environments	3DV 2017	[project]
Joint 2D-3D-Semantic Data for Indoor Scene Understanding	CoRR 2017	[project]
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes	CVPR 2017	[project]
SceneNN: a Scene Meshes Dataset with aNNotations	3DV 2016	[project]
SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite	CVPR 2015	[project]
SUN3D: A Database of Big Spaces Reconstructed using SfM and Object Labels	ICCV 2013	[project]
Indoor Segmentation and Support Inference from RGBD Images	ECCV 2012	[project]

Synthetic Dataset

Papers	Venue	Links
Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation	CVPR 2024	[project]
R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding	CoRR 2024	[project]
FurniScene: A Large-scale 3D Room Dataset with Intricate Furnishing Scenes	CoRR 2024	-
GeoSynth: A Photorealistic Synthetic Indoor Dataset for Scene Understanding	VR 2023	[code]
MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis	CGF 2022	[project]
3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics	ICCV 2021	[project]
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding	ICCV 2021	[project]
OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets	CVPR 2021	[project]
Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling	ECCV 2020	[project]
InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset	BMVC 2018	[project]
SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation?	ICCV 2017	[project]
Semantic Scene Completion from a Single Depth Image	CVPR 2017	-
SceneNet: Understanding Real World Indoor Scenes With Synthetic Data	CVPR 2016	[project]
The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes	CVPR 2016	[project]

Holistic Scene Understanding

Perspective Image

Papers	Venue	Links
Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture	3DV 2024	[project]
Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes	ECCV 2022	[code]
Holistic 3D Scene Understanding from a Single Image with Implicit Representation	CVPR 2021	[project] [code]
Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image	CVPR 2020	[code]
PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points	NeurIPS 2019	-
Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense	ICCV 2019	[project] [code]
Complete 3D Scene Parsing from an RGBD Image	IJCV 2018	-
Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation	NeurIPS 2018	[project] [code]
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image	ECCV 2018	[project] [code]
Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene	CVPR 2018	[project] [code]
IM2CAD	CVPR 2017	[project]
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding	ICCV 2017	[project]
Emptying, Refurnishing, and Relighting Indoor Spaces	SIGGRAPH Asia 2016	[project]
Scene Parsing by Integrating Function, Geometry and Appearance Models	CVPR 2013	-
Understanding Indoor Scenes using 3D Geometric Phrases	CVPR 2013	-
Recovering Free Space of Indoor Scenes from a Single Image	CVPR 2012	-
Efficient Exact Inference for 3D Indoor Scene Understanding	ECCV 2012	-
Efficient Structured Prediction for 3D Indoor Scene Understanding	CVPR 2012	-
Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces	NeurIPS 2010	-
Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry	ECCV 2010	-

Panoramic Image

Papers	Venue	Links
PanoContext-Former: Panoramic Total Scene Understanding with a Transformer	CVPR 2024	-
PanelNet: Understanding 360 Indoor Environment via Panel Representation	CVPR 2023	-
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization	ICCV 2021	[code]
HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features	CVPR 2021	[Code]
Automatic 3D Indoor Scene Modeling from Single Panorama	CVPR 2018	-
Pano2CAD: Room Layout From A Single Panorama Image	WACV 2017	-
PanoContext: A Whole-room 3D Context Model for Panoramic Scene Understanding	ECCV 2014	[project]

Room Layout Estimation

Perspective Image

(AW: Atlanta world, SS: single-floor and single-ceiling, PP: piece-wise planarity.)

Dataset	Year	Modality	#Frames	Prior	Source
CAD-Estate	2023	RGB Video	-	Generic	RealEstate-10K
Matterport3D-Layout	2020	RGB-D	7360	PP	Matterport
ScanNet-Layout	2020	RGB-D	293	PP	ScanNet
Structured3D	2020	RGB-D	82027	AW+SS	Structured3D
LSUN Room Layout	2016	RGB	5394	Cuboid	SUN
SUN RGB-D	2015	RGB-D	10335	AW+SS	NYUv2, Berkeley B3DO, and SUN3D
NYUv2 303	2013	RGB-D	303	Cuboid	NYUv2
Hedau	2009	RGB	366	Cuboid	-

Papers	Venue	Links
Polygon Detection for Room Layout Estimation using Heterogeneous Graphs and Wireframes	ICCV Workshop 2023	[code]
ST-RoomNet: Learning Room Layout Estimation From Single Image Through Unsupervised Spatial Transformations	CVPR Workshop 2023	-
Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image	WACV 2022	[code]
RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View	CoRR 2021	-
GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes	ECCV 2020	[Matterport3D Layout Dataset]
Structural Deep Metric Learning for Room Layout Estimation	ECCV 2020	-
General 3D Room Layout from a Single View by Render-and-Compare	ECCV 2020	[project] [ScanNet-Layout Dataset] [code]
Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation	WACV 2020	-
Flat2Layout: Flat Representation for Estimating Layout of General Room Types	CoRR 2019	-
Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts	ACCV 2018	-
RoomNet: End-to-End Room Layout Estimation	ICCV 2017	-
Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation	CVPR 2017	[project]
A Coarse-to-Fine Indoor Layout Estimation (CFILE) Method	ACCV 2016	-
DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes	CVPR 2016	-
Learning Informative Edge Maps for Indoor Scene Layout Prediction	ICCV 2015	-
Rent3D: Floor-Plan Priors for Monocular Layout Estimation	CVPR 2015	[project]
Box In the Box: Joint 3D Layout and Object Reasoning from Single Images	CVPR 2013	-
Estimating the 3D Layout of Indoor Scenes and its Clutter from Depth Sensors	ICCV 2013	[project]
Recovering the Spatial Layout of Cluttered Rooms	ICCV 2009	-

Panoramic Image

(MW: Manhattan world, AW: Atlanta world, SS: single-floor and single-ceiling.)

Dataset	Year	Modality	#Frames	Prior	Source
ZInD	2021	RGB	71474	AW+SS	ZInD
MatterportLayout	2020	RGB-D	2295	MW+SS	Matterport
Structured3D	2020	RGB-D	196515	AW+SS	Structured3D
LayoutMP3D	2020	RGB-D	2505	MW+SS	Matterport
2D-3D-S	2018	RGB-D	571	Cuboid	2D-3D-S
PanoContext	2014	RGB	500	Cuboid	SUN360

Papers	Venue	Links
No More Ambiguity in 360° Room Layout via Bi-Layout Estimation	CVPR 2024	-
Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction	CVPR 2024	-
iBARLE: imBalance-Aware Room Layout Estimation	CoRR 2023	-
📷 GPR-Net: Multi-view Layout Estimation via a Geometry-aware Panorama Registration Network	CVPR Workshop 2023	-
Shape-Net: Room Layout Estimation from Panoramic Images Robust to Occlusion using Knowledge Distillation with 3D Shapes as Additional Inputs	CVPR Workshop 2023	-
U2RLE: Uncertainty-Guided 2-Stage Room Layout Estimation	CVPR 2023	-
Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness	CVPR 2023	[Code]
📷 360-MLC: Multi-view Layout Consistency for Self-training and Hyper-parameter Tuning	NeurIPS 2022	[Project]
3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform	ECCV 2022	[Code]
3D Room Layout Recovery Generalizing across Manhattan and Non-Manhattan Worlds	CVPR 2022	-
📷 PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation	CVPR 2022	[code]
Self-supervised 360˚ Room Layout Estimation	CoRR 2022	[code]
LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network	CVPR 2022	-
Deep3DLayout: 3D Reconstruction of an Indoor Layout from a Spherical Panoramic Image	SIGGRAPH Asia 2021	[project]
Transferable End-to-end Room Layout Estimation via Implicit Encoding	CoRR 2021	[project]
OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas	CVPR Workshop 2021	[code]
LED²-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering	CVPR 2021	[project] [code]
SSLayout360: Semi-Supervised Indoor Layout Estimation from 360 Panorama	CVPR 2021	-
Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas	Image and Vision Computing 2021	[project] [code]
Manhattan Room Layout Reconstruction from a Single 360 image: A Comparative Study of State-of-the-art Methods	IJCV 2021	[code] [MatterportLayout Dataset]
Training and Post Processing 3D Room Layout Beyond the Manhattan World Assumption	ECCV Workshop 2020	-
Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image	ECCV 2020	-
AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption	ECCV 2020	[project] [code]
Corners for Layout: End-to-End Layout Recovery from 360 Images	ICRA 2019	[project] [code]
DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama	CVPR 2019	[project]
HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation	CVPR 2019	[code]
Layouts from Panoramic Images with Geometry and Deep Learning	IROS 2018	[code]
LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image	CVPR 2018	[code]
Efficient 3D Room Shape Recovery From a Single Panorama	CVPR 2016	[code]

Floorplan

Papers	Venue	Links
🎲 FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation	ECCV 2024	[code]
🎲 PolyRoom: Room-aware Transformer for Floorplan Reconstruction	ECCV 2024	[code]
🎲 PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models	NeurIPS 2023	[project]
🎲 Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries	CVPR 2023	[project] [code]
📷 Floorplan Restoration by Structure Hallucinating Transformer Cascades	CoRR 2022	-
📷 MVLayoutNet: 3D Layout Reconstruction with Multi-View Panoramas	CoRR 2021	-
📷 Extreme Structure From Motion for Indoor Panoramas Without Visual Overlaps	ICCV 2021	[code]
🎲 MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans	ICCV 2021	-
🎲 Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes	CoRR 2020	-
🎲 Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path	ICCV 2019	[project] [code]
📷 Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans	ICCV 2019	[project]
🎲 DeepPerimeter: Indoor Boundary Estimation from Posed Monocular Sequences	CoRR 2019	-
📷 FloorNet: A unified framework for floorplan reconstruction from 3D scans	ECCV 2018	[project] [code]

Floorplan Vectorization

Papers	Venue	Links
VectorFloorSeg: Two-Stream Graph Attention Network for Vectorized Roughcast Floorplan Segmentation	CVPR 2023	[code]
Parsing Line Segments of Floor Plan Images Using Graph Neural Networks	CoRR 2023	-
Residential floor plan recognition and reconstruction	CVPR 2021	-
Versailles-FP dataset: Wall Detection in Ancient Floor Plans	CoRR 2021	-
Deep Floor Plan Recognition using a Multi-task Network with Room-boundary-Guided Attention	ICCV 2019	[project]
CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis	Scandinavian Conference on Image Analysis 2019	[code]
Raster-to-Vector: Revisiting Floorplan Transformation	ICCV 2017	[project] [code]

Visual Localization

Papers	Venue	Links
SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments	ECCV 2024	[project] [code]
LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments	ECCV 2022	[code]
LASER: LAtent SpacE Rendering for 2D Visual Localization	CVPR 2022	-
LaLaLoc: Latent Layout Localisation in Dynamic, Unvisited Environments	ICCV 2021	-

Primitive

Junction

Papers	Venue	Links
Manhattan Junction Catalogue for Spatial Reasoning of Indoor Scenes	CVPR 2013	-

Line Segment and Wireframe

Papers	Venue	Links
📷 Volumetric Wireframe Parsing from Neural Attraction Fields	CoRR 2023	[code]
📷 NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images	CVPR 2023	[project]
DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients	CoRR 2022	[Code]
Holistically-Attracted Wireframe Parsing: From Supervised to Self-Supervised Learning	CoRR 2022	-
🎲 Learning to Construct 3D Building Wireframes from 3D Line Clouds	BMVC 2022	[Code]
HoW-3D: Holistic 3D Wireframe Perception from a Single Image	3DV 2022	[Code]
Semantic Room Wireframe Detection from a Single View	ICPR 2022	[code]
Towards Real-time and Light-weight Line Segment Detection	AAAI 2022	[code]
Hole-robust Wireframe Detection	WACV 2022	-
Fully Convolutional Line Parsing	Neurocomputing 2022	[code]
ELSD: Efficient Line Segment Detector and Descriptor	ICCV 2021	-
SOLD²: Self-supervised Occlusion-aware Line Description and Detection	CVPR 2021	[code]
Line Segment Detection Using Transformers without Edges	CVPR 2021	[code]
PlueckerNet: Learn to Register 3D Line Reconstructions	CVPR 2020	[code]
LGNN: A Context-aware Line Segment Detector	ACM MM 2020	-
TP-LSD: Tri-Points Based Line Segment Detector	ECCV 2020	[code]
Deep Hough-Transform Line Priors	ECCV 2020	[code]
Deep Hough Transform for Semantic Line Detection	ECCV 2020	[code]
Holistically-Attracted Wireframe Parsing	CVPR 2020	[code]
Learning to Reconstruct 3D Manhattan Wireframes from a Single Image	ICCV 2019	[code]
End-to-End Wireframe Parsing	ICCV 2019	[code]
PPGNet: Learning Point-Pair Graph for Line Segment Detection	CVPR 2019	[code]
Learning Attraction Field Representation for Robust Line Segment Detection	CVPR 2019	[code]
Novel Single View Constraints for Manhattan 3D Line Reconstruction	3DV 2018	-
Learning to Parse Wireframes in Images of Man-Made Environments	CVPR 2018	[code]
A Novel Linelet-Based Representation for Line Segment Detection	TPAMI 2018	-
MCMLSD: A Dynamic Programming Approach to Line Segment Detection	CVPR 2017	-
Lifting 3D Manhattan Lines from a Single Image	ICCV 2013	-
LSD: A Fast Line Segment Detector with a False Detection Control	TPAMI 2010	-

Outdoor Architecture

Papers	Venue	Links
HEAT: Holistic Edge Attention Transformer for Structured Reconstruction	CVPR 2022	[Project]
Structured Outdoor Architecture Reconstruction by Exploration and Classification	ICCV 2021	[Project]
Roof-GAN: Learning to Generate Roof Geometry and Relations for Residential Houses	CVPR 2021	[Code]
Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference	ECCV 2020	[Project]
Conv-MPN: Convolutional Message Passing Neural Network for Structured Outdoor Architecture Reconstruction	CVPR 2020	[Project]

Plane

Papers	Venue	Links
📷 UniPlane: Unified Plane Detection and Reconstruction from Posed Monocular Videos	CoRR 2024	-
📷 AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings	CVPR 2024	[project]
PlaneRecTR: Unified Query learning for 3D Plane Recovery from a Single View	ICCV 2023	[Code]
📷 NOPE-SAC: Neural One-Plane RANSAC for Sparse-View Planar 3D Reconstruction	CoRR 2022	[Code]
📷 PlaneFormers: From Sparse View Planes to 3D Reconstruction	ECCV 2022	[project] [code]
📷 PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos	CVPR 2022	[Project]
PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image	BMVC 2021	[code]
PlaneTR: Structure-Guided Transformers for 3D Plane Recovery	ICCV 2021	[code]
📷 Planar Surface Reconstruction From Sparse Views	ICCV 2021	[project] [code]
Indoor Panorama Planar 3D Reconstruction via Divide and Conquer	CVPR 2021	[code]
Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction	ECCV 2020	[code]
Peek-a-Boo: Occlusion Reasoning in Indoor Scenes with Plane Representations	CVPR 2020	[project]
Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding	CVPR 2019	[code]
PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image	CVPR 2019	[project] [code]
Recovering 3D Planes from a Single Image via Convolutional Neural Networks	ECCV 2018	[code]
PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image	CVPR 2018	[project] [code]

Vanishing Point

Papers	Venue	Links
Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction	ICCV 2023	[code]
Transformer Based Line Segment Classifier with Image Context for Real-Time Vanishing Point Detection in Manhattan World	CVPR 2022	-
Deep Vanishing Point Detection: Geometric Priors Make Dataset Variations Vanish	CVPR 2022	-
VaPiD: A Rapid Vanishing Point Detector via Learned Optimizers	ICCV 2021	-
NeurVPS: Neural Vanishing Point Scanning via Conic Convolution	NeurIPS 2019	[Code]
