Awesome-BEV-Perception: A repository from autodriving-heart


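This repository is curated by the team behind the WeChat official account 自动驾驶之心 (Autonomous Driving Heart). Follow the account for the latest technical posts.

自动驾驶之心 is the first autonomous driving developer community in China. It offers the most comprehensive and effective learning roadmaps for autonomous driving and AI (perception, localization, fusion), along with referral opportunities at autonomous driving and AI companies.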

一、Overview

1. A review of BEV-based 3D object detection

Vision-Centric BEV Perception: A Survey

2. A review of recent advances in BEV perception

Delving into the Devils of Bird’s-eye-view Perception: A Review, Evaluation and Recipe

[Code]

3. A review of vision-radar fusion for BEV detection

Vision-RADAR fusion for Robotics BEV Detections: A Survey

4. A review of surround-view 3D object detection for autonomous driving

Surround-View Vision-based 3D Detection for Autonomous Driving: A Survey

二、Camera-based BEV

1. List of camera-based BEV perception methods

Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

project

BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View

project

BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection

project

BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection

project

DSGN: Deep Stereo Geometry Network for 3D Object Detection

project

LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-Based 3D Detector

project

Is Pseudo-Lidar Needed for Monocular 3D Object Detection?

project

Inverse perspective mapping simplifies optical flow computation and obstacle detection

Deep Learning based Vehicle Position and Orientation Estimation via Inverse Perspective Mapping Image

Learning to Map Vehicles into Bird’s Eye View

Monocular 3D Vehicle Detection Using Uncalibrated Traffic Cameras through Homography

Driving Among Flatmobiles: Bird-Eye-View Occupancy Grids From a Monocular Camera for Holistic Trajectory Planning

Understanding Bird’s-Eye View of Road Semantics Using an Onboard Camera

project

Automatic dense visual semantic mapping from street-level imagery

Stacked Homography Transformations for Multi-View Pedestrian Detection

Cross-View Semantic Segmentation for Sensing Surroundings

project

FISHING Net: Future Inference of Semantic Heatmaps In Grids

NEAT: Neural Attention Fields for End-to-End Autonomous Driving

project

Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-View Transformation

project

Bird’s-Eye-View Panoptic Segmentation Using Monocular Frontal View Images

project

BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers

project

PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark

project

PETR: Position Embedding Transformation for Multi-View 3D Object Detection

project

DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries

project

Translating Images into Maps

project

GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation

PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

project

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection / supplemental

project

MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones

project

FIERY: Future Instance Prediction in Bird's-Eye View From Surround Monocular Cameras

BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving

project

2. Estimation of BEV Appearance and Occupancy Information Based on Surrounding Monocular Images

Estimation of Appearance and Occupancy Information in Bird’s EyeView from Surround Monocular Images

[Code]

3. BEV representation without camera parameters

Multi-Camera Calibration Free BEV Representation for 3D Object Detection

4. BEVFormer v2

BEVFormer v2: Adapting Modern Image Backbones to Bird’s-Eye-View Recognition via Perspective Supervision

5. Multi-view subject registration with uncalibrated cameras

From a Bird’s Eye View to See: Joint Camera and Subject Registration without the Camera Calibration

三、LiDAR-based BEV

1. List of LiDAR-based BEV perception methods

VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

SECOND: Sparsely Embedded Convolutional Detection

project

Center-Based 3D Object Detection and Tracking

project

PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection

project

PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection

project

Structure Aware Single-Stage 3D Object Detection From Point Cloud

project

Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection

project

Object DGCNN: 3D Object Detection using Dynamic Graphs

Voxel Transformer for 3D Object Detection

Embracing Single Stride 3D Object Detector With Sparse Transformer / paper / supplemental

project

AFDetV2: Rethinking the Necessity of the Second Stage for Object Detection from Point Clouds

PointPillars: Fast Encoders for Object Detection From Point Clouds

2. Point cloud-based pre-training framework

BEV-MAE: Bird's Eye View Masked Autoencoders for Outdoor Point Cloud Pre-training

3. BEV-SAN: Accurate BEV 3D object detection using slice attention

BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks

4. Joint training of 2D object detection and LiDAR (2.5D points)

Objects as Spatio-Temporal 2.5D points

5. BEV-LGKD: a unified framework applying knowledge distillation to BEV 3D object detection

BEV-LGKD: A Unified LiDAR-Guided Knowledge Distillation Framework for BEV 3D Object Detection

四、BEV Fusion

1. List of BEV fusion methods

Unifying Voxel-based Representation with Transformer for 3D Object Detection

project

MVFuseNet: Improving End-to-End Object Detection and Motion Forecasting Through Multi-View Fusion of LiDAR Data

UniFormer: Unified Multi-view Fusion Transformer for Spatial-Temporal Representation in Bird's-Eye-View

BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

project

BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework

project

2. X-Align: improved BEV feature fusion of camera and LiDAR

X-Align Cross-Modal Cross-View Alignment for Bird’s-Eye-View Segmentation

3. Radar and LiDAR BEV fusion system

RaLiBEV: Radar and LiDAR BEV Fusion Learning for Anchor Box Free Object Detection System

4. Summary of multimodal fusion methods under BEV

PointPainting: Sequential Fusion for 3D Object Detection (CVPR'20)

project

3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection (ECCV'20)

project

FUTR3D: A Unified Sensor Fusion Framework for 3D Detection (arXiv'22)

project

MVP: Multimodal Virtual Point 3D Detection (NeurIPS'21)

project

PointAugmenting: Cross-Modal Augmentation for 3D Object Detection

project

FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection

project

Unifying Voxel-based Representation with Transformer for 3D Object Detection

project

TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers

project

AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection

project

AutoAlignV2: Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection

project

CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection

project

MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection

project

五、Summary of multi-task learning methods under BEV

FIERY: Future Instance Prediction in Bird’s-Eye View from Surround Monocular Cameras

project

StretchBEV: Stretching Future Instance Prediction Spatially and Temporally

project

BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving

project

M²BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Bird’s-Eye View Representation

project

STSU: Structured Bird’s-Eye-View Traffic Scene Understanding from Onboard Images

project

BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers

project

Ego3RT: Learning Ego 3D Representation as Ray Tracing

project

PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

project

PolarFormer: Multi-camera 3D Object Detection with Polar Transformers

project

六、PV2BEV method summary

1. Summary of depth-based PV2BEV methods

OFT: Orthographic Feature Transform for Monocular 3D Object Detection

project

CaDDN: Categorical Depth Distribution Network for Monocular 3D Object Detection

project

DSGN: Deep Stereo Geometry Network for 3D Object Detection

project

Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

project

PanopticSeg: Bird’s-Eye-View Panoptic Segmentation Using Monocular Frontal View Images

project

FIERY: Future Instance Prediction in Bird’s-Eye View from Surround Monocular Cameras

project

LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector

project

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

project

BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View

project

M²BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Bird’s-Eye View Representation

project

StretchBEV: Stretching Future Instance Prediction Spatially and Temporally

project

DfM: Monocular 3D Object Detection with Depth from Motion

project

BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection

project

BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving

project

MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones

project

Putting People in their Place: Monocular Regression of 3D People in Depth

Code Project Video RH Dataset

2. Summary of homography-based PV2BEV methods

IPM: Inverse perspective mapping simplifies optical flow computation and obstacle detection

DSM: Automatic Dense Visual Semantic Mapping from Street-Level Imagery

MapV: Learning to map vehicles into bird’s eye view

BridgeGAN: Generative Adversarial Frontal View to Bird View Synthesis

project

VPOE: Deep learning based vehicle position and orientation estimation via inverse perspective mapping image

3D-LaneNet: End-to-End 3D Multiple Lane Detection

The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping

Cam2BEV: A Sim2Real Deep Learning Approach for the Transformation of Images from Multiple Vehicle-Mounted Cameras to a Semantically Segmented Image in Bird’s Eye View

project

MonoLayout: Amodal Scene Layout from a Single Image

project

MVNet: Multiview Detection with Feature Perspective Transformation

project

OGMs: Driving among Flatmobiles: Bird-Eye-View occupancy grids from a monocular camera for holistic trajectory planning

project

TrafCam3D: Monocular 3D Vehicle Detection Using Uncalibrated Traffic Cameras through Homography

project

SHOT: Stacked Homography Transformations for Multi-View Pedestrian Detection

HomoLoss: Homography Loss for Monocular 3D Object Detection
