Awesome Scene Understanding

A list of papers for scene understanding.

Workshops and Tutorials

Holistic Structures for 3D Vision (ICCV'21) [Webpage]
Holistic Scene Structures for 3D Vision (ECCV'20) [Webpage] [Challenge]
Holistic 3D Reconstruction: Learning to Reconstruct Holistic 3D Structures from Sensorial Data (ICCV'19) [Webpage] [Resources]

Survey

State-of-the-art in Automatic 3D Reconstruction of Structured Indoor Environments (Computer Graphics Forum'20) [Paper]
Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey (IEEE Access'19) [Paper]
RGBD Datasets: Past, Present and Future (CVPR Workshop'16) [Project] [Paper]

Dataset

Realistic Dataset

HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures (CoRR'20) [Project] [Paper] [Code]
OASIS: A Large-Scale Dataset for Single Image 3D in the Wild (CVPR'20) [Project] [Paper]
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera (ICCV'19) [Project] [Paper]
The Replica Dataset: A Digital Replica of Indoor Spaces (CoRR'19) [Paper] [Code]
Matterport3D: Learning from RGB-D Data in Indoor Environments (3DV'17) [Project] [Paper] [Code]
[2D-3D-S] Joint 2D-3D-Semantic Data for Indoor Scene Understanding (CoRR'17) [Project] [Paper]
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes (CVPR'17) [Project] [Paper]
SceneNN: a Scene Meshes Dataset with aNNotations (3DV'16) [Project] [Paper]
SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite (CVPR'15) [Project] [Paper]
SUN3D: A Database of Big Spaces Reconstructed using SfM and Object Labels (ICCV'13) [Project] [Paper]
[NYUv2] Indoor Segmentation and Support Inference from RGBD Images (ECCV'12) [Project] [Paper]

Synthetic Dataset

MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis (CoRR'21) [Project] [Paper]
3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics (ICCV'21) [Project] [Paper] [Code] [Rendering Tool]
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding (ICCV'21) [Project] [Paper] [Code]
OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets (CVPR'21) [Project] [Paper]
Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling (ECCV'20) [Project] [Paper] [Code]
InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset (BMVC'18) [Project] [Paper]
SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation? (ICCV'17) [Project] [Paper]
[SUNCG] Semantic Scene Completion from a Single Depth Image (CVPR'17) [Paper]
SceneNet: Understanding Real World Indoor Scenes With Synthetic Data (CVPR'16) [Project] [Paper]
The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes (CVPR'16) [Project] [Paper]

Holistic Scene Understanding

Perspective Image

Holistic 3D Scene Understanding from a Single Image with Implicit Representation (CVPR'21) [Paper]
Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image (CVPR'20) [Paper] [Code]
3D Object Detection from a Single RGB Image via Perspective Points (NeurIPS'19) [Paper]
Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense (ICCV'19) [Project] [Paper] [Code]
Complete 3D Scene Parsing from an RGBD Image (IJCV'18) [Paper]
Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation (NeurIPS'18) [Project] [Paper] [Code]
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image (ECCV'18) [Project] [Paper] [Code]
Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene (CVPR'18) [Project] [Paper] [Code]
Im2CAD (CVPR'18) [Project] [Paper]
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding (ICCV'17) [Project] [Paper]
Emptying, Refurnishing, and Relighting Indoor Spaces (SIGGRAPH Asia'16) [Project] [Paper]
Scene Parsing by Integrating Function, Geometry and Appearance Models (CVPR'13) [Project] [Paper]
Understanding Indoor Scenes using 3D Geometric Phrases (CVPR'13) [Paper]
Recovering Free Space of Indoor Scenes from a Single Image (CVPR'12) [Paper]
Efficient Exact Inference for 3D Indoor Scene Understanding (ECCV'12) [Paper]
Efficient Structured Prediction for 3D Indoor Scene Understanding (CVPR'12) [Paper]
Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces (NeurIPS'10) [Paper]
Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry (ECCV'10) [Paper]

Panoramic Image

DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV'21) [Paper] [Code]
Automatic 3D Indoor Scene Modeling from Single Panorama (CVPR'18) [Paper]
Pano2CAD: Room Layout From A Single Panorama Image (WACV'17) [Paper]
PanoContext: A Whole-room 3D Context Model for Panoramic Scene Understanding (ECCV'14) [Project] [Paper]

Room Layout Estimation

Perspective Image

(AW: Atlanta world, SS: single-floor and single-ceiling, PP: piece-wise planarity.)

Dataset	Modality	#Frames	Prior	Source
Hedau (ICCV'09)	RGB	366	Cuboid	-
NYUv2 303 (ICCV'13)	RGB-D	303	Cuboid	NYUv2
LSUN Room Layout (2016)	RGB	5394	Cuboid	SUN
SUN RGB-D (CVPR'15)	RGB-D	10335	AW+SS	NYUv2, Berkeley B3DO, and SUN3D
ScanNet-Layout (ECCV'20)	RGB-D	293	PP	ScanNet
Matterport3D-Layout (ECCV'20)	RGB-D	7360	PP	Matterport
Structured3D (ECCV'20)	RGB-D	82027	AW+SS	Structured3D

Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image (WACV'22) [Paper] [Code]
RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View (CoRR'21) [Paper]
GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes (ECCV'20) [Paper] [Matterport3D Layout Dataset]
Structural Deep Metric Learning for Room Layout Estimation (ECCV'20) [Paper]
General 3D Room Layout from a Single View by Render-and-Compare (ECCV'20) [Project] [Paper] [ScanNet-Layout Dataset] [Code]
Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation (WACV'20) [Paper]
Flat2Layout: Flat Representation for Estimating Layout of General Room Types (CoRR'19) [Paper]
Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts (ACCV'18) [Paper]
RoomNet: End-to-End Room Layout Estimation (ICCV'17) [Paper]
Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation (CVPR'17) [Project] [Paper]
A Coarse-to-Fine Indoor Layout Estimation (CFILE) Method (ACCV'16) [Paper]
DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes (CVPR'16) [Paper]
Learning Informative Edge Maps for Indoor Scene Layout Prediction (ICCV'15) [Homepage] [Paper]
Rent3D: Floor-Plan Priors for Monocular Layout Estimation (CVPR'15) [Project] [Paper]
Box In the Box: Joint 3D Layout and Object Reasoning from Single Images (CVPR'13) [Paper]
Estimating the 3D Layout of Indoor Scenes and its Clutter from Depth Sensors (ICCV'13) [Project] [Paper]
Recovering the Spatial Layout of Cluttered Rooms (ICCV'09) [Paper]

Panoramic Image

(MW: Manhattan world, AW: Atlanta world, SS: single-floor and single-ceiling.)

Dataset	Modality	#Frames	Prior	Source
PanoContext (ECCV'14)	RGB	500	Cuboid	SUN360
2D-3D-S (CVPR'18)	RGB-D	571	Cuboid	2D-3D-S
MatterportLayout (2020)	RGB-D	2295	MW+SS	Matterport
LayoutMP3D (2020)	RGB-D	2505	MW+SS	Matterport
Structured3D (ECCV'20)	RGB-D	196515	AW+SS	Structured3D
ZInD (CVPR'21)	RGB	71474	AW+SS	ZInD

Transferable End-to-end Room Layout Estimation via Implicit Encoding (CoRR'21) [Paper] [Project]
Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts (CVPR'21) [Paper] [Code]
OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas (CVPR Workshop'21) [Paper] [Code]
LED²-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering (CVPR'21) [Project] [Paper] [Code]
SSLayout360: Semi-Supervised Indoor Layout Estimation from 360 Panorama (CVPR'21) [Paper]
Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas (CoRR'21) [Project] [Paper] [Code]
Manhattan Room Layout Reconstruction from a Single 360 image: A Comparative Study of State-of-the-art Methods (IJCV'21) [Paper] [Code] [MatterportLayout Dataset]
Training and Post Processing 3D Room Layout Beyond the Manhattan World Assumption (ECCV Workshop'20) [Paper]
Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image (ECCV'20) [Paper]
AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption (ECCV'20) [Project] [Paper] [Code]
Corners for Layout: End-to-End Layout Recovery from 360 Images (ICRA'19) [Project] [Paper] [Code]
DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama (CVPR'19) [Project] [Paper]
HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation (CVPR'19) [Paper] [Code]
Layouts from Panoramic Images with Geometry and Deep Learning (IROS'18) [Paper] [Code]
LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image (CVPR'18) [Paper] [Code]
Efficient 3D Room Shape Recovery From a Single Panorama (CVPR'16) [Project] [Paper] [Code]

Floorplan

Point Cloud

MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans (ICCV'21) [Paper]
Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes (CoRR'20) [Paper]
Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path (ICCV'19) [Project] [Paper] [Code]
FloorNet: A unified framework for floorplan reconstruction from 3D scans (ECCV'18) [Project] [Paper] [Code]

Multi-view

Extreme Structure From Motion for Indoor Panoramas Without Visual Overlaps (ICCV'21) [Paper] [Code]
Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans (ICCV'19) [Project] [Paper]
DeepPerimeter: Indoor Boundary Estimation from Posed Monocular Sequences (CoRR'19) [Paper]

Image

Residential floor plan recognition and reconstruction (CVPR'21) [Paper]
Versailles-FP dataset: Wall Detection in Ancient Floor Plans (CoRR'21) [Paper]
HouseExpo: A Large-scale 2D Indoor Layout Dataset for Learning-based Algorithms on Mobile Robots (IROS'20) [Paper] [Code]
Deep Floor Plan Recognition using a Multi-task Network with Room-boundary-Guided Attention (ICCV'19) [Project] [Paper]
CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis (CoRR'19) [Paper] [Code]
Raster-to-Vector: Revisiting Floorplan Transformation (ICCV'17) [Project] [Paper] [Code]

Primitive Detection

Junction

Manhattan Junction Catalogue for Spatial Reasoning of Indoor Scenes (CVPR'13) [Paper]

Line Segment and Wireframe

Please refer to Wireframe for a more comprehensive review.

Towards Real-time and Light-weight Line Segment Detection (AAAI'22) [Paper] [Code]
Hole-robust Wireframe Detection (WACV'22) [Paper]
Fully Convolutional Line Parsing (CoRR'21) [Paper] [Code]
ELSD: Efficient Line Segment Detector and Descriptor (ICCV'21) [Paper]
SOLD²: Self-supervised Occlusion-aware Line Description and Detection (CVPR'21) [Paper] [Code]
Line Segment Detection Using Transformers without Edges (CVPR'21) [Paper]
PlueckerNet: Learn to Register 3D Line Reconstructions (CVPR'20) [Paper] [Code]
LGNN: A Context-aware Line Segment Detector (ACM MM'20) [Paper]
TP-LSD: Tri-Points Based Line Segment Detector (ECCV'20) [Paper]
Deep Hough-Transform Line Priors (ECCV'20) [Paper] [Code]
Deep Hough Transform for Semantic Line Detection (ECCV'20) [Paper] [Code]
Holistically-Attracted Wireframe Parsing (CVPR'20) [Paper] [Code]
Learning to Reconstruct 3D Manhattan Wireframes from a Single Image (ICCV'19) [Paper] [Code]
End-to-End Wireframe Parsing (ICCV'19) [Paper] [Code]
PPGNet: Learning Point-Pair Graph for Line Segment Detection (CVPR'19) [Paper] [Code]
Learning Attraction Field Representation for Robust Line Segment Detection (CVPR'19) [Paper] [Code]
Novel Single View Constraints for Manhattan 3D Line Reconstruction (3DV'18) [Paper]
Learning to Parse Wireframes in Images of Man-Made Environments (CVPR'18) [Paper] [Code]
A Novel Linelet-Based Representation for Line Segment Detection (TPAMI'18) [Paper]
MCMLSD: A Dynamic Programming Approach to Line Segment Detection (CVPR'17) [Paper]
Lifting 3D Manhattan Lines from a Single Image (ICCV'15) [Paper]
LSD: A Fast Line Segment Detector with a False Detection Control (TPAMI'10) [Paper]

Plane

PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image (BMVC'21) [Paper] [Code]
PlaneTR: Structure-Guided Transformers for 3D Plane Recovery (ICCV'21) [Paper] [Code]
Planar Surface Reconstruction From Sparse Views (ICCV'21) [Project] [Paper] [Code]
Indoor Panorama Planar 3D Reconstruction via Divide and Conquer (CVPR'21) [Paper] [Code]
Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction (ECCV'20) [Paper] [Code]
Peek-a-Boo: Occlusion Reasoning in Indoor Scenes with Plane Representations (CVPR'20) [Project] [Paper]
Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding (CVPR'19) [Paper] [Code]
PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image (CVPR'19) [Project] [Paper] [Code]
Recovering 3D Planes from a Single Image via Convolutional Neural Networks (ECCV'18) [Paper] [Code]
PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image (CVPR'18) [Project] [Paper] [Code]

Cuboid

Deep Cuboid Detection: Beyond 2D Bounding Boxes (CoRR'16) [Paper]
A Linear Approach to Matching Cuboids in RGBD Images (CVPR'13) [Project] [Paper]
Localizing 3D Cuboids in Single-view Images (NIPS'12) [Paper]

Others

Bottom-Up/Top-Down Image Parsing with Attribute Grammar (TPAMI'09) [Paper]
Detection and Matching of Rectilinear Structures (CVPR'08) [Paper]
