😎 A list of awesome scene understanding papers.

MIT License

Awesome Scene Understanding

A curated list of awesome scene understanding papers, inspired by awesome-computer-vision.

  • 📷 Multi-view images
  • 🎲 Point cloud

Related Resources

Workshops and Tutorials

Survey

Papers Venue Links
Advances in Data-Driven Analysis and Synthesis of 3D Indoor Scenes CGF 2023 -
State-of-the-art in Automatic 3D Reconstruction of Structured Indoor Environments CGF 2020 [project]
Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey IEEE Access 2019 -
RGBD Datasets: Past, Present and Future CVPR Workshop 2016 [project]

Dataset

Realistic Dataset

Papers Venue Links
ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes ICCV 2023 [project]
ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data NeurIPS 2021 Dataset Track [code]
Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts CVPR 2021 [code]
HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures CoRR 2020 [project]
OASIS: A Large-Scale Dataset for Single Image 3D in the Wild CVPR 2020 [project]
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera ICCV 2019 [project]
The Replica Dataset: A Digital Replica of Indoor Spaces CoRR 2019 [code]
Matterport3D: Learning from RGB-D Data in Indoor Environments 3DV 2017 [project]
Joint 2D-3D-Semantic Data for Indoor Scene Understanding CoRR 2017 [project]
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes CVPR 2017 [project]
SceneNN: a Scene Meshes Dataset with aNNotations 3DV 2016 [project]
SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite CVPR 2015 [project]
SUN3D: A Database of Big Spaces Reconstructed using SfM and Object Labels ICCV 2013 [project]
Indoor Segmentation and Support Inference from RGBD Images ECCV 2012 [project]

Synthetic Dataset

Papers Venue Links
Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation CVPR 2024 [project]
R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding CoRR 2024 [project]
FurniScene: A Large-scale 3D Room Dataset with Intricate Furnishing Scenes CoRR 2024 -
GeoSynth: A Photorealistic Synthetic Indoor Dataset for Scene Understanding VR 2023 [code]
MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis CGF 2022 [project]
3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics ICCV 2021 [project]
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding ICCV 2021 [project]
OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets CVPR 2021 [project]
Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling ECCV 2020 [project]
InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset BMVC 2018 [project]
SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation? ICCV 2017 [project]
Semantic Scene Completion from a Single Depth Image CVPR 2017 -
SceneNet: Understanding Real World Indoor Scenes With Synthetic Data CVPR 2016 [project]
The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes CVPR 2016 [project]

Holistic Scene Understanding

Perspective Image

Papers Venue Links
Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture 3DV 2024 [project]
Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes ECCV 2022 [code]
Holistic 3D Scene Understanding from a Single Image with Implicit Representation CVPR 2021 [project] [code]
Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image CVPR 2020 [code]
PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points NeurIPS 2019 -
Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense ICCV 2019 [project] [code]
Complete 3D Scene Parsing from an RGBD Image IJCV 2018 -
Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation NeurIPS 2018 [project] [code]
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image ECCV 2018 [project] [code]
Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene CVPR 2018 [project] [code]
Im2CAD CVPR 2018 [project]
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding ICCV 2017 [project]
Emptying, Refurnishing, and Relighting Indoor Spaces SIGGRAPH Asia 2016 [project]
Scene Parsing by Integrating Function, Geometry and Appearance Models CVPR 2013 -
Understanding Indoor Scenes using 3D Geometric Phrases CVPR 2013 -
Recovering Free Space of Indoor Scenes from a Single Image CVPR 2012 -
Efficient Exact Inference for 3D Indoor Scene Understanding ECCV 2012 -
Efficient Structured Prediction for 3D Indoor Scene Understanding CVPR 2012 -
Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces NeurIPS 2010 -
Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry ECCV 2010 -

Panoramic Image

Papers Venue Links
PanoContext-Former: Panoramic Total Scene Understanding with a Transformer CVPR 2024 -
PanelNet: Understanding 360 Indoor Environment via Panel Representation CVPR 2023 -
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization ICCV 2021 [code]
HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features CVPR 2021 [Code]
Automatic 3D Indoor Scene Modeling from Single Panorama CVPR 2018 -
Pano2CAD: Room Layout From A Single Panorama Image WACV 2017 -
PanoContext: A Whole-room 3D Context Model for Panoramic Scene Understanding ECCV 2014 [project]

Room Layout Estimation

Perspective Image

(AW: Atlanta world, SS: single-floor and single-ceiling, PP: piecewise planarity.)

| Dataset | Year | Modality | #Frames | Prior | Source |
|---|---|---|---|---|---|
| CAD-Estate | 2023 | RGB Video | - | Generic | RealEstate-10K |
| Matterport3D-Layout | 2020 | RGB-D | 7360 | PP | Matterport |
| ScanNet-Layout | 2020 | RGB-D | 293 | PP | ScanNet |
| Structured3D | 2020 | RGB-D | 82027 | AW+SS | Structured3D |
| LSUN Room Layout | 2016 | RGB | 5394 | Cuboid | SUN |
| SUN RGB-D | 2015 | RGB-D | 10335 | AW+SS | NYUv2, Berkeley B3DO, and SUN3D |
| NYUv2 303 | 2013 | RGB-D | 303 | Cuboid | NYUv2 |
| Hedau | 2009 | RGB | 366 | Cuboid | - |

Papers Venue Links
Polygon Detection for Room Layout Estimation using Heterogeneous Graphs and Wireframes ICCV Workshop 2023 [code]
ST-RoomNet: Learning Room Layout Estimation From Single Image Through Unsupervised Spatial Transformations CVPR Workshop 2023 -
Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image WACV 2022 [code]
RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View CoRR 2021 -
GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes ECCV 2020 [Matterport3D Layout Dataset]
Structural Deep Metric Learning for Room Layout Estimation ECCV 2020 -
General 3D Room Layout from a Single View by Render-and-Compare ECCV 2020 [project] [ScanNet-Layout Dataset] [code]
Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation WACV 2020 -
Flat2Layout: Flat Representation for Estimating Layout of General Room Types CoRR 2019 -
Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts ACCV 2018 -
RoomNet: End-to-End Room Layout Estimation ICCV 2017 -
Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation CVPR 2017 [project]
A Coarse-to-Fine Indoor Layout Estimation (CFILE) Method ACCV 2016 -
DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes CVPR 2016 -
Learning Informative Edge Maps for Indoor Scene Layout Prediction ICCV 2015 -
Rent3D: Floor-Plan Priors for Monocular Layout Estimation CVPR 2015 [project]
Box In the Box: Joint 3D Layout and Object Reasoning from Single Images CVPR 2013 -
Estimating the 3D Layout of Indoor Scenes and its Clutter from Depth Sensors ICCV 2013 [project]
Recovering the Spatial Layout of Cluttered Rooms ICCV 2009 -

Panoramic Image

(MW: Manhattan world, AW: Atlanta world, SS: single-floor and single-ceiling.)

| Dataset | Year | Modality | #Frames | Prior | Source |
|---|---|---|---|---|---|
| ZInD | 2021 | RGB | 71474 | AW+SS | ZInD |
| MatterportLayout | 2020 | RGB-D | 2295 | MW+SS | Matterport |
| Structured3D | 2020 | RGB-D | 196515 | AW+SS | Structured3D |
| LayoutMP3D | 2020 | RGB-D | 2505 | MW+SS | Matterport |
| 2D-3D-S | 2018 | RGB-D | 571 | Cuboid | 2D-3D-S |
| PanoContext | 2014 | RGB | 500 | Cuboid | SUN360 |

Papers Venue Links
No More Ambiguity in 360° Room Layout via Bi-Layout Estimation CVPR 2024
Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction CVPR 2024
iBARLE: imBalance-Aware Room Layout Estimation CoRR 2023
📷 GPR-Net: Multi-view Layout Estimation via a Geometry-aware Panorama Registration Network CVPR Workshop 2023 -
Shape-Net: Room Layout Estimation from Panoramic Images Robust to Occlusion using Knowledge Distillation with 3D Shapes as Additional Inputs CVPR Workshop 2023
U2RLE: Uncertainty-Guided 2-Stage Room Layout Estimation CVPR 2023
Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness CVPR 2023 [Code]
📷 360-MLC: Multi-view Layout Consistency for Self-training and Hyper-parameter Tuning NeurIPS 2022 [Project]
3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform ECCV 2022 [Code]
3D Room Layout Recovery Generalizing across Manhattan and Non-Manhattan Worlds CVPR 2022 -
📷 PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation CVPR 2022 [code]
Self-supervised 360° Room Layout Estimation CoRR 2022 [code]
LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network CVPR 2022 -
Deep3DLayout: 3D Reconstruction of an Indoor Layout from a Spherical Panoramic Image SIGGRAPH Asia 2021 [project]
Transferable End-to-end Room Layout Estimation via Implicit Encoding CoRR 2021 [project]
OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas CVPR Workshop 2021 [code]
LED2-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering CVPR 2021 [project] [code]
SSLayout360: Semi-Supervised Indoor Layout Estimation from 360 Panorama CVPR 2021 -
Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas Image and Vision Computing 2021 [project] [code]
Manhattan Room Layout Reconstruction from a Single 360 image: A Comparative Study of State-of-the-art Methods IJCV 2021 [code] [MatterportLayout Dataset]
Training and Post Processing 3D Room Layout Beyond the Manhattan World Assumption ECCV Workshop 2020 -
Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image ECCV 2020 -
AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption ECCV 2020 [project] [code]
Corners for Layout: End-to-End Layout Recovery from 360 Images ICRA 2019 [project] [code]
DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama CVPR 2019 [project]
HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation CVPR 2019 [code]
Layouts from Panoramic Images with Geometry and Deep Learning IROS 2018 [code]
LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image CVPR 2018 [code]
Efficient 3D Room Shape Recovery From a Single Panorama CVPR 2016 [code]

Floorplan

Papers Venue Links
🎲 FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation ECCV 2024 [code]
🎲 PolyRoom: Room-aware Transformer for Floorplan Reconstruction ECCV 2024 [code]
🎲 PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models NeurIPS 2023 [project]
🎲 Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries CVPR 2023 [project] [code]
📷 Floorplan Restoration by Structure Hallucinating Transformer Cascades CoRR 2022 -
📷 MVLayoutNet: 3D Layout Reconstruction with Multi-View Panoramas CoRR 2021 -
📷 Extreme Structure From Motion for Indoor Panoramas Without Visual Overlaps ICCV 2021 [code]
🎲 MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans ICCV 2021 -
🎲 Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes CoRR 2020 -
🎲 Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path ICCV 2019 [project] [code]
📷 Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans ICCV 2019 [project]
🎲 DeepPerimeter: Indoor Boundary Estimation from Posed Monocular Sequences CoRR 2019 -
📷 FloorNet: A unified framework for floorplan reconstruction from 3D scans ECCV 2018 [project] [code]

Floorplan Vectorization

Papers Venue Links
VectorFloorSeg: Two-Stream Graph Attention Network for Vectorized Roughcast Floorplan Segmentation CVPR 2023 [code]
Parsing Line Segments of Floor Plan Images Using Graph Neural Networks CoRR 2023 -
Residential floor plan recognition and reconstruction CVPR 2021 -
Versailles-FP dataset: Wall Detection in Ancient Floor Plans CoRR 2021 -
Deep Floor Plan Recognition using a Multi-task Network with Room-boundary-Guided Attention ICCV 2019 [project]
CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis Scandinavian Conference on Image Analysis 2019 [code]
Raster-to-Vector: Revisiting Floorplan Transformation ICCV 2017 [project] [code]

Visual Localization

Papers Venue Links
SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments ECCV 2024 [project] [code]
LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments ECCV 2022 [code]
LASER: LAtent SpacE Rendering for 2D Visual Localization CVPR 2022 -
LaLaLoc: Latent Layout Localisation in Dynamic, Unvisited Environments ICCV 2021 -

Primitive

Junction

Papers Venue Links
Manhattan Junction Catalogue for Spatial Reasoning of Indoor Scenes CVPR 2013 -

Line Segment and Wireframe

Papers Venue Links
📷 Volumetric Wireframe Parsing from Neural Attraction Fields CoRR 2023 [code]
📷 NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images CVPR 2023 [project]
DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients CoRR 2022 [Code]
Holistically-Attracted Wireframe Parsing: From Supervised to Self-Supervised Learning CoRR 2022 -
🎲 Learning to Construct 3D Building Wireframes from 3D Line Clouds BMVC 2022 [Code]
HoW-3D: Holistic 3D Wireframe Perception from a Single Image 3DV 2022 [Code]
Semantic Room Wireframe Detection from a Single View ICPR 2022 [code]
Towards Real-time and Light-weight Line Segment Detection AAAI 2022 [code]
Hole-robust Wireframe Detection WACV 2022 -
Fully Convolutional Line Parsing Neurocomputing 2022 [code]
ELSD: Efficient Line Segment Detector and Descriptor ICCV 2021 -
SOLD2: Self-supervised Occlusion-aware Line Description and Detection CVPR 2021 [code]
Line Segment Detection Using Transformers without Edges CVPR 2021 [code]
PlueckerNet: Learn to Register 3D Line Reconstructions CVPR 2020 [code]
LGNN: A Context-aware Line Segment Detector ACM MM 2020 -
TP-LSD: Tri-Points Based Line Segment Detector ECCV 2020 [code]
Deep Hough-Transform Line Priors ECCV 2020 [code]
Deep Hough Transform for Semantic Line Detection ECCV 2020 [code]
Holistically-Attracted Wireframe Parsing CVPR 2020 [code]
Learning to Reconstruct 3D Manhattan Wireframes from a Single Image ICCV 2019 [code]
End-to-End Wireframe Parsing ICCV 2019 [code]
PPGNet: Learning Point-Pair Graph for Line Segment Detection CVPR 2019 [code]
Learning Attraction Field Representation for Robust Line Segment Detection CVPR 2019 [code]
Novel Single View Constraints for Manhattan 3D Line Reconstruction 3DV 2018 -
Learning to Parse Wireframes in Images of Man-Made Environments CVPR 2018 [code]
A Novel Linelet-Based Representation for Line Segment Detection TPAMI 2018 -
MCMLSD: A Dynamic Programming Approach to Line Segment Detection CVPR 2017 -
Lifting 3D Manhattan Lines from a Single Image ICCV 2013 -
LSD: A Fast Line Segment Detector with a False Detection Control TPAMI 2010 -

Outdoor Architecture

Papers Venue Links
HEAT: Holistic Edge Attention Transformer for Structured Reconstruction CVPR 2022 [Project]
Structured Outdoor Architecture Reconstruction by Exploration and Classification ICCV 2021 [Project]
Roof-GAN: Learning to Generate Roof Geometry and Relations for Residential Houses CVPR 2021 [Code]
Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference ECCV 2020 [Project]
Conv-MPN: Convolutional Message Passing Neural Network for Structured Outdoor Architecture Reconstruction CVPR 2020 [Project]

Plane

Papers Venue Links
📷 UniPlane: Unified Plane Detection and Reconstruction from Posed Monocular Videos CoRR 2024
📷 AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings CVPR 2024 [project]
PlaneRecTR: Unified Query learning for 3D Plane Recovery from a Single View ICCV 2023 [Code]
📷 NOPE-SAC: Neural One-Plane RANSAC for Sparse-View Planar 3D Reconstruction CoRR 2022 [Code]
📷 PlaneFormers: From Sparse View Planes to 3D Reconstruction ECCV 2022 [project] [code]
📷 PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos CVPR 2022 [Project]
PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image BMVC 2021 [code]
PlaneTR: Structure-Guided Transformers for 3D Plane Recovery ICCV 2021 [code]
📷 Planar Surface Reconstruction From Sparse Views ICCV 2021 [project] [code]
Indoor Panorama Planar 3D Reconstruction via Divide and Conquer CVPR 2021 [code]
Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction ECCV 2020 [code]
Peek-a-Boo: Occlusion Reasoning in Indoor Scenes with Plane Representations CVPR 2020 [project]
Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding CVPR 2019 [code]
PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image CVPR 2019 [project] [code]
Recovering 3D Planes from a Single Image via Convolutional Neural Networks ECCV 2018 [code]
PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image CVPR 2018 [project] [code]

Vanishing Point

Papers Venue Links
Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction ICCV 2023 [code]
Transformer Based Line Segment Classifier with Image Context for Real-Time Vanishing Point Detection in Manhattan World CVPR 2022 -
Deep Vanishing Point Detection: Geometric Priors Make Dataset Variations Vanish CVPR 2022 -
VaPiD: A Rapid Vanishing Point Detector via Learned Optimizers ICCV 2021 -
NeurVPS: Neural Vanishing Point Scanning via Conic Convolution NeurIPS 2019 [Code]