autodriving-heart/CVPR2023-Papers-autonomous-driving

CVPR 2023 Papers: Autonomous Driving

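Accepted CVPR 2023 papers are being released. The 自动驾驶之心 team has compiled papers covering computer vision, BEV, segmentation, occupancy, ViT, SLAM, few-shot/zero-shot learning, point cloud processing, autonomous driving, and other areas. This list will be updated continuously.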

Author: 汽车人 | 自动驾驶之心 technical discussion group: WeChat Official Accounts Platform (qq.com)

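Follow @自动驾驶之心 to be the first to see the most cutting-edge and valuable CV, autonomous driving, and AI work!

Highly recommended: the autonomous driving and AI learning community. Join the first autonomous driving developer community in China! It offers comprehensive learning roadmaps for autonomous driving and AI (perception, localization, fusion) and referral opportunities at autonomous driving and AI companies.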

3D Object Detection

1.Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View

2.MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection

3.Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency

4.Uni3D: A Unified Baseline for Multi-dataset 3D Object Detection

5.Virtual Sparse Convolution for Multimodal 3D Object Detection

6.X3KD: Knowledge Distillation Across Modalities, Tasks and Stages for Multi-Camera 3D Object Detection

7.3D Video Object Detection with Learnable Object-Centric Global Optimization

8.CAPE: Camera View Position Embedding for Multi-View 3D Object Detection

9.Bi3D: Bi-domain Active Learning for Cross-domain 3D Object Detection

10.AeDet: Azimuth-invariant Multi-view 3D Object Detection

11.Hierarchical Supervision and Shuffle Data Augmentation for 3D Semi-Supervised Object Detection

12.LinK: Linear Kernel for LiDAR-based 3D Perception

13.PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection

14.LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion

BEV Perception

1.Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View

2.Understanding the Robustness of 3D Object Detection with Bird's-Eye-View Representations in Autonomous Driving

3.TBP-Former: Learning Temporal Bird's-Eye-View Pyramid for Joint Perception and Prediction in Vision-Centric Autonomous Driving

Occupancy

1.Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction

Segmentation

1.Delivering Arbitrary-Modal Semantic Segmentation

2.Token Contrast for Weakly-Supervised Semantic Segmentation

3.ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution

4.Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation

5.MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving

6.FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation

7.InstMove: Instance Motion for Object-centric Video Segmentation

8.MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation

9.MP-Former: Mask-Piloted Transformer for Image Segmentation

10.Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos

11.LaserMix for Semi-Supervised LiDAR Semantic Segmentation

12.Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks

13.EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision

14.Generative Semantic Segmentation

15.DynaMask: Dynamic Mask Selection for Instance Segmentation

16.Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation

17.Exploiting the Complementarity of 2D and 3D Networks to Address Domain-Shift in 3D Semantic Segmentation

18.DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation

19.3D Semantic Segmentation in the Wild: Learning Generalized Models for Adverse-Condition Point Clouds

SLAM

1.Renderable Neural Radiance Map for Visual Navigation

2.PVO: Panoptic Visual Odometry

Transformer

1.Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves

2.Reversible Vision Transformers

3.BiFormer: Vision Transformer with Bi-Level Routing Attention

Few-Shot/Zero-Shot

1.Zero-shot Object Counting

Diffusion Model

1.Person Image Synthesis via Denoising Diffusion Model

2.Controllable Mesh Generation Through Sparse Latent Point Diffusion Models

3.Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models

Knowledge Distillation

1.Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation

2.KD-DLGAN: Data Limited Image Generation via Knowledge Distillation

Point Clouds

1.ACL-SPC: Adaptive Closed-Loop system for Self-Supervised Point Cloud Completion

2.PointCert: Point Cloud Classification with Deterministic Certified Robustness Guarantees

3.Neural Intrinsic Embedding for Non-rigid Point Cloud Matching

4.Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting

5.Rotation-Invariant Transformer for Point Cloud Matching

6.Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis

7.SCPNet: Semantic Scene Completion on Point Cloud

8.CLIP2Scene: Towards Label-Efficient 3D Scene Understanding by CLIP

9.PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations

10.Binarizing Sparse Convolutional Networks for Efficient Point Cloud Analysis

11.NerVE: Neural Volumetric Edges for Parametric Curve Extraction from Point Cloud

12.LidarGait: Benchmarking 3D Gait Recognition with Point Clouds

Trajectory Prediction

1.IPCC-TP: Utilizing Incremental Pearson Correlation Coefficient for Joint Multi-Agent Trajectory Prediction

Anomaly Detection

1.Multimodal Industrial Anomaly Detection via Hybrid Fusion

4D Radar

1.Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision

Object Detection

1.MixTeacher: Mining Promising Labels with Mixed Scale Teacher for Semi-Supervised Object Detection

2.Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR

3.Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection

4.Dense Distinct Query for End-to-End Object Detection

5.Detecting Everything in the Open World: Towards Universal Object Detection

6.One-to-Few Label Assignment for End-to-End Dense Detection

Object Tracking

1.Referring Multi-Object Tracking

2.Visual Prompt Multi-Modal Tracking

3.MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking

4.On the Benefits of 3D Pose and Tracking for Human Action Recognition

Depth Estimation

1.Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation

2.HRDFuse: Monocular 360° Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions

Lane Detection

1.BEV-LaneDet: a Simple and Effective 3D Lane Detection Baseline

Others

1.PMatch: Paired Masked Image Modeling for Dense Geometric Matching

2.V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception