😎 A list of papers for scene understanding in computer vision.

Awesome Scene Understanding

A list of papers for scene understanding.

Workshops and Tutorials

  • Holistic Structures for 3D Vision (ICCV'21) [Webpage]

  • Holistic Scene Structures for 3D Vision (ECCV'20) [Webpage] [Challenge]

  • Holistic 3D Reconstruction: Learning to Reconstruct Holistic 3D Structures from Sensorial Data (ICCV'19) [Webpage] [Resources]

Survey

  • State-of-the-art in Automatic 3D Reconstruction of Structured Indoor Environments (Computer Graphics Forum'20) [Paper]

  • Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey (IEEE Access'19) [Paper]

  • RGBD Datasets: Past, Present and Future (CVPR Workshop'16) [Project] [Paper]

Dataset

Realistic Dataset

  • HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures (CoRR'20) [Project] [Paper] [Code]

  • OASIS: A Large-Scale Dataset for Single Image 3D in the Wild (CVPR'20) [Project] [Paper]

  • 3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera (ICCV'19) [Project] [Paper]

  • The Replica Dataset: A Digital Replica of Indoor Spaces (CoRR'19) [Paper] [Code]

  • Matterport3D: Learning from RGB-D Data in Indoor Environments (3DV'17) [Project] [Paper] [Code]

  • [2D-3D-S] Joint 2D-3D-Semantic Data for Indoor Scene Understanding (CoRR'17) [Project] [Paper]

  • ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes (CVPR'17) [Project] [Paper]

  • SceneNN: A Scene Meshes Dataset with aNNotations (3DV'16) [Project] [Paper]

  • SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite (CVPR'15) [Project] [Paper]

  • SUN3D: A Database of Big Spaces Reconstructed using SfM and Object Labels (ICCV'13) [Project] [Paper]

  • [NYUv2] Indoor Segmentation and Support Inference from RGBD Images (ECCV'12) [Project] [Paper]

Synthetic Dataset

  • MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis (CoRR'21) [Project] [Paper]

  • 3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics (ICCV'21) [Project] [Paper] [Code] [Rendering Tool]

  • Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding (ICCV'21) [Project] [Paper] [Code]

  • OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets (CVPR'21) [Project] [Paper]

  • Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling (ECCV'20) [Project] [Paper] [Code]

  • InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset (BMVC'18) [Project] [Paper]

  • SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation? (ICCV'17) [Project] [Paper]

  • [SUNCG] Semantic Scene Completion from a Single Depth Image (CVPR'17) [Paper]

  • SceneNet: Understanding Real World Indoor Scenes With Synthetic Data (CVPR'16) [Project] [Paper]

  • The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes (CVPR'16) [Project] [Paper]

Holistic Scene Understanding

Perspective Image

  • Holistic 3D Scene Understanding from a Single Image with Implicit Representation (CVPR'21) [Paper]

  • Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image (CVPR'20) [Paper] [Code]

  • 3D Object Detection from a Single RGB Image via Perspective Points (NeurIPS'19) [Paper]

  • Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense (ICCV'19) [Project] [Paper] [Code]

  • Complete 3D Scene Parsing from an RGBD Image (IJCV'18) [Paper]

  • Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation (NeurIPS'18) [Project] [Paper] [Code]

  • Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image (ECCV'18) [Project] [Paper] [Code]

  • Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene (CVPR'18) [Project] [Paper] [Code]

  • Im2CAD (CVPR'18) [Project] [Paper]

  • DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding (ICCV'17) [Project] [Paper]

  • Emptying, Refurnishing, and Relighting Indoor Spaces (SIGGRAPH Asia'16) [Project] [Paper]

  • Scene Parsing by Integrating Function, Geometry and Appearance Models (CVPR'13) [Project] [Paper]

  • Understanding Indoor Scenes using 3D Geometric Phrases (CVPR'13) [Paper]

  • Recovering Free Space of Indoor Scenes from a Single Image (CVPR'12) [Paper]

  • Efficient Exact Inference for 3D Indoor Scene Understanding (ECCV'12) [Paper]

  • Efficient Structured Prediction for 3D Indoor Scene Understanding (CVPR'12) [Paper]

  • Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces (NeurIPS'10) [Paper]

  • Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry (ECCV'10) [Paper]

Panoramic Image

  • DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV'21) [Paper] [Code]

  • Automatic 3D Indoor Scene Modeling from Single Panorama (CVPR'18) [Paper]

  • Pano2CAD: Room Layout From A Single Panorama Image (WACV'17) [Paper]

  • PanoContext: A Whole-room 3D Context Model for Panoramic Scene Understanding (ECCV'14) [Project] [Paper]

Room Layout Estimation

Perspective Image

(AW: Atlanta world, SS: single-floor and single-ceiling, PP: piece-wise planarity.)

| Dataset | Modality | #Frames | Prior | Source |
| --- | --- | --- | --- | --- |
| Hedau (ICCV'09) | RGB | 366 | Cuboid | - |
| NYUv2 303 (ICCV'13) | RGB-D | 303 | Cuboid | NYUv2 |
| LSUN Room Layout (2016) | RGB | 5394 | Cuboid | SUN |
| SUN RGB-D (CVPR'15) | RGB-D | 10335 | AW+SS | NYUv2, Berkeley B3DO, and SUN3D |
| ScanNet-Layout (ECCV'20) | RGB-D | 293 | PP | ScanNet |
| Matterport3D-Layout (ECCV'20) | RGB-D | 7360 | PP | Matterport |
| Structured3D (ECCV'20) | RGB-D | 82027 | AW+SS | Structured3D |

  • Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image (WACV'22) [Paper] [Code]

  • RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View (CoRR'21) [Paper]

  • GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes (ECCV'20) [Paper] [Matterport3D Layout Dataset]

  • Structural Deep Metric Learning for Room Layout Estimation (ECCV'20) [Paper]

  • General 3D Room Layout from a Single View by Render-and-Compare (ECCV'20) [Project] [Paper] [ScanNet-Layout Dataset] [Code]

  • Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation (WACV'20) [Paper]

  • Flat2Layout: Flat Representation for Estimating Layout of General Room Types (CoRR'19) [Paper]

  • Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts (ACCV'18) [Paper]

  • RoomNet: End-to-End Room Layout Estimation (ICCV'17) [Paper]

  • Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation (CVPR'17) [Project] [Paper]

  • A Coarse-to-Fine Indoor Layout Estimation (CFILE) Method (ACCV'16) [Paper]

  • DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes (CVPR'16) [Paper]

  • Learning Informative Edge Maps for Indoor Scene Layout Prediction (ICCV'15) [Homepage] [Paper]

  • Rent3D: Floor-Plan Priors for Monocular Layout Estimation (CVPR'15) [Project] [Paper]

  • Box In the Box: Joint 3D Layout and Object Reasoning from Single Images (CVPR'13) [Paper]

  • Estimating the 3D Layout of Indoor Scenes and its Clutter from Depth Sensors (ICCV'13) [Project] [Paper]

  • Recovering the Spatial Layout of Cluttered Rooms (ICCV'09) [Paper]

Panoramic Image

(MW: Manhattan world, AW: Atlanta world, SS: single-floor and single-ceiling.)

| Dataset | Modality | #Frames | Prior | Source |
| --- | --- | --- | --- | --- |
| PanoContext (ECCV'14) | RGB | 500 | Cuboid | SUN360 |
| 2D-3D-S (CVPR'18) | RGB-D | 571 | Cuboid | 2D-3D-S |
| MatterportLayout (2020) | RGB-D | 2295 | MW+SS | Matterport |
| LayoutMP3D (2020) | RGB-D | 2505 | MW+SS | Matterport |
| Structured3D (ECCV'20) | RGB-D | 196515 | AW+SS | Structured3D |
| ZInD (CVPR'21) | RGB | 71474 | AW+SS | ZInD |

  • Transferable End-to-end Room Layout Estimation via Implicit Encoding (CoRR'21) [Paper] [Project]

  • Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts (CVPR'21) [Paper] [Code]

  • OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas (CVPR Workshop'21) [Paper] [Code]

  • LED2-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering (CVPR'21) [Project] [Paper] [Code]

  • SSLayout360: Semi-Supervised Indoor Layout Estimation from 360 Panorama (CVPR'21) [Paper]

  • Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas (CoRR'21) [Project] [Paper] [Code]

  • Manhattan Room Layout Reconstruction from a Single 360 Image: A Comparative Study of State-of-the-art Methods (IJCV'21) [Paper] [Code] [MatterportLayout Dataset]

  • Training and Post Processing 3D Room Layout Beyond the Manhattan World Assumption (ECCV Workshop'20) [Paper]

  • Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image (ECCV'20) [Paper]

  • AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption (ECCV'20) [Project] [Paper] [Code]

  • Corners for Layout: End-to-End Layout Recovery from 360 Images (ICRA'19) [Project] [Paper] [Code]

  • DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama (CVPR'19) [Project] [Paper]

  • HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation (CVPR'19) [Paper] [Code]

  • Layouts from Panoramic Images with Geometry and Deep Learning (IROS'18) [Paper] [Code]

  • LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image (CVPR'18) [Paper] [Code]

  • Efficient 3D Room Shape Recovery From a Single Panorama (CVPR'16) [Project] [Paper] [Code]

Floorplan

Point Cloud

  • MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans (ICCV'21) [Paper]

  • Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes (CoRR'20) [Paper]

  • Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path (ICCV'19) [Project] [Paper] [Code]

  • FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans (ECCV'18) [Project] [Paper] [Code]

Multi-view

  • Extreme Structure From Motion for Indoor Panoramas Without Visual Overlaps (ICCV'21) [Paper] [Code]

  • Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans (ICCV'19) [Project] [Paper]

  • DeepPerimeter: Indoor Boundary Estimation from Posed Monocular Sequences (CoRR'19) [Paper]

Image

  • Residential Floor Plan Recognition and Reconstruction (CVPR'21) [Paper]

  • Versailles-FP dataset: Wall Detection in Ancient Floor Plans (CoRR'21) [Paper]

  • HouseExpo: A Large-scale 2D Indoor Layout Dataset for Learning-based Algorithms on Mobile Robots (IROS'20) [Paper] [Code]

  • Deep Floor Plan Recognition using a Multi-task Network with Room-boundary-Guided Attention (ICCV'19) [Project] [Paper]

  • CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis (CoRR'19) [Paper] [Code]

  • Raster-to-Vector: Revisiting Floorplan Transformation (ICCV'17) [Project] [Paper] [Code]

Primitive Detection

Junction

  • Manhattan Junction Catalogue for Spatial Reasoning of Indoor Scenes (CVPR'13) [Paper]

Line Segment and Wireframe

Please refer to Wireframe for a more comprehensive review.

  • Towards Real-time and Light-weight Line Segment Detection (AAAI'22) [Paper] [Code]

  • Hole-robust Wireframe Detection (WACV'22) [Paper]

  • Fully Convolutional Line Parsing (CoRR'21) [Paper] [Code]

  • ELSD: Efficient Line Segment Detector and Descriptor (ICCV'21) [Paper]

  • SOLD²: Self-supervised Occlusion-aware Line Description and Detection (CVPR'21) [Paper] [Code]

  • Line Segment Detection Using Transformers without Edges (CVPR'21) [Paper]

  • PlueckerNet: Learn to Register 3D Line Reconstructions (CVPR'20) [Paper] [Code]

  • LGNN: A Context-aware Line Segment Detector (ACM MM'20) [Paper]

  • TP-LSD: Tri-Points Based Line Segment Detector (ECCV'20) [Paper]

  • Deep Hough-Transform Line Priors (ECCV'20) [Paper] [Code]

  • Deep Hough Transform for Semantic Line Detection (ECCV'20) [Paper] [Code]

  • Holistically-Attracted Wireframe Parsing (CVPR'20) [Paper] [Code]

  • Learning to Reconstruct 3D Manhattan Wireframes from a Single Image (ICCV'19) [Paper] [Code]

  • End-to-End Wireframe Parsing (ICCV'19) [Paper] [Code]

  • PPGNet: Learning Point-Pair Graph for Line Segment Detection (CVPR'19) [Paper] [Code]

  • Learning Attraction Field Representation for Robust Line Segment Detection (CVPR'19) [Paper] [Code]

  • Novel Single View Constraints for Manhattan 3D Line Reconstruction (3DV'18) [Paper]

  • Learning to Parse Wireframes in Images of Man-Made Environments (CVPR'18) [Paper] [Code]

  • A Novel Linelet-Based Representation for Line Segment Detection (TPAMI'18) [Paper]

  • MCMLSD: A Dynamic Programming Approach to Line Segment Detection (CVPR'17) [Paper]

  • Lifting 3D Manhattan Lines from a Single Image (ICCV'15) [Paper]

  • LSD: A Fast Line Segment Detector with a False Detection Control (TPAMI'10) [Paper]

Plane

  • PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image (BMVC'21) [Paper] [Code]

  • PlaneTR: Structure-Guided Transformers for 3D Plane Recovery (ICCV'21) [Paper] [Code]

  • Planar Surface Reconstruction From Sparse Views (ICCV'21) [Project] [Paper] [Code]

  • Indoor Panorama Planar 3D Reconstruction via Divide and Conquer (CVPR'21) [Paper] [Code]

  • Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction (ECCV'20) [Paper] [Code]

  • Peek-a-Boo: Occlusion Reasoning in Indoor Scenes with Plane Representations (CVPR'20) [Project] [Paper]

  • Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding (CVPR'19) [Paper] [Code]

  • PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image (CVPR'19) [Project] [Paper] [Code]

  • Recovering 3D Planes from a Single Image via Convolutional Neural Networks (ECCV'18) [Paper] [Code]

  • PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image (CVPR'18) [Project] [Paper] [Code]

Cuboid

  • Deep Cuboid Detection: Beyond 2D Bounding Boxes (CoRR'16) [Paper]

  • A Linear Approach to Matching Cuboids in RGBD Images (CVPR'13) [Project] [Paper]

  • Localizing 3D Cuboids in Single-view Images (NIPS'12) [Paper]

Others

  • Bottom-Up/Top-Down Image Parsing with Attribute Grammar (TPAMI'09) [Paper]

  • Detection and Matching of Rectilinear Structures (CVPR'08) [Paper]