This repository collects research papers and open-source resources on knowledge-driven autonomous driving (AD). It will be continuously updated to track the frontier of knowledge-driven AD.
🌟 Stars and contributions (PRs) to this list are welcome! 🌟
- [2023.12.08] New: We released the survey 'Towards Knowledge-driven Autonomous Driving'!
- [2023.10.24] New: We released the Awesome Knowledge-driven AD list!
The autonomous driving community has witnessed substantial growth in approaches that embrace a knowledge-driven paradigm. Here, we delve into knowledge-driven autonomous driving, exploring motivations, components, challenges, and prospects. More details of knowledge-driven autonomous driving can be found in our paper.
Key components in knowledge-driven AD.
Knowledge-augmented Dataset | Sensors (C: camera, L: LiDAR, R: radar) | Knowledge Form | Tasks | Metrics |
---|---|---|---|---|
BDD-X | C | Explanation | Vehicle Control, Explanation Generation, Scene Captioning | MAE, MDC, BLEU-4, METEOR, CIDEr-D |
Cityscapes-Ref | C | Object Referral, Gaze Heatmap | Object Referring | Acc@1 |
DR(eye)VE | C | Gaze Heatmap | Gaze Prediction | CC, KLD, IG |
HAD | C | Advice | Vehicle Control | MAE, MDC |
Talk2Car | C+L+R | Object Referral | Object Referring | IoU@0.5 |
DADA-2000 | C | Gaze Heatmap, Crash Objects, Accident Window | Gaze Prediction | CC, KLD, NSS, SIM |
HDBD | C | Gaze Heatmap, Takeover Intention | Driver Takeover Detection | AUC |
Refer-KITTI | C+L | Object Referral | Object Referring, Object Tracking | HOTA |
DRAMA | C | Advice, Risk Localization | Motion Planning | L2 Error, Collision Rate |
Rank2Tell | C+L | Object Referral, Importance Ranking | Importance Estimation, Scene Captioning | F1 Score, Accuracy, BLEU-4, METEOR, ROUGE, CIDEr |
DriveLM | C+L+R | Scene Captioning, Question Answering | Scene Captioning, Question Answering | - |
NuScenes-QA | C+L+R | Question Answering | Question Answering | Exist, Count, Object, Status, Comparison, Acc |
DESIGN | C+L+R | Scene Captioning, Question Answering | Question Answering, Motion Planning | BLEU-4, METEOR, ROUGE, L2 Error, Collision Rate |
Reason2Drive | C+L | Question Answering | Question Answering | BLEU-4, METEOR, ROUGE, CIDEr |
- UniSim: A Neural Closed-Loop Sensor Simulator [CVPR 2023, Project]
- NuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles [arxiv 2023, Github]
- DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model [arxiv 2023, Project]
- OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving [arxiv 2023, Project]
- ADriver-I: A General World Model for Autonomous Driving [arxiv 2023]
- Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving [arxiv 2023, Project, Github]
- WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation [arxiv 2023, Github]
- DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving [arxiv 2023]
- MagicDrive: Street View Generation with Diverse 3D Geometry Control [arxiv 2023]
- GAIA-1: A Generative World Model for Autonomous Driving [arxiv 2023]
- Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [NeurIPS 2023, Github]
- MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations [arxiv 2023]
- Natural-language-driven Simulation Benchmark and Copilot for Efficient Production of Object Interactions in Virtual Road Scenes [arxiv 2023]
- LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs [arxiv 2023]
- Grounding human-to-vehicle advice for self-driving vehicles [CVPR 2019]
- ADAPT: Action-aware Driving Caption Transformer [ICRA 2023, Github]
- Talk to the Vehicle: Language Conditioned Autonomous Navigation of Self Driving Cars [IROS 2019]
- Talk2Car: Taking Control of Your Self-Driving Car [EMNLP-IJCNLP 2019, Project]
- Textual explanations for self-driving vehicles [ECCV 2018, Github]
- Drive Like a Human: Rethinking Autonomous Driving with Large Language Models [arxiv 2023, Github]
- DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model [arxiv 2023, Project]
- DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models [arxiv 2023, Github]
- GPT-Driver: Learning to Drive with GPT [arxiv 2023, Github]
- Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving [arxiv 2023, Github]
- LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving [arxiv 2023, Project]
- Receive, Reason, and React: Drive as You Say with Large Language Models in Autonomous Vehicles [arxiv 2023]
- Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles [arxiv 2023]
- SurrealDriver: Designing Generative Driver Agent Simulation Framework in Urban Contexts based on Large Language Model [arxiv 2023]
- Language-Guided Traffic Simulation via Scene-Level Diffusion [arxiv 2023]
- Language Prompt for Autonomous Driving [arxiv 2023, Github]
- Talk2BEV: Language-Enhanced Bird's Eye View (BEV) Maps [arxiv 2023, Project, Github]
- BEVGPT: Generative Pre-trained Large Model for Autonomous Driving Prediction, Decision-Making, and Planning [arxiv 2023]
- HiLM-D: Towards High-Resolution Understanding in Multimodal Large Language Models for Autonomous Driving [arxiv 2023]
- Can you text what is happening? Integrating pre-trained language encoders into trajectory prediction models for autonomous driving [arxiv 2023]
- OpenAnnotate3D: Open-Vocabulary Auto-Labeling System for Multi-modal 3D Data [arxiv 2023, Github]
- LangProp: A Code Optimization Framework Using Language Models Applied to Driving [openreview 2023, Github]
- Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion [openreview 2023]
- Planning with an Ensemble of World Models [openreview 2023]
- Large Language Models Can Design Game-Theoretic Objectives for Multi-Agent Planning [openreview 2023]
- TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction [arxiv 2023]
- BEV-CLIP: Multi-Modal BEV Retrieval Methodology for Complex Scene in Autonomous Driving [arxiv 2023]
- Semantic Anomaly Detection with Large Language Models [arxiv 2023]
- Driving through the Concept Gridlock: Unraveling Explainability Bottlenecks in Automated Driving [arxiv 2023]
- DRAMA: Joint risk localization and captioning in driving [WACV 2023]
- 3D Dense Captioning Beyond Nouns: A Middleware for Autonomous Driving [openreview 2023]
- SwapTransformer: Highway Overtaking Tactical Planner Model via Imitation Learning on OSHA Dataset [openreview 2023]
- NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario [arxiv 2023, Github]
- Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models [arxiv 2023]
- Addressing Limitations of State-Aware Imitation Learning for Autonomous Driving [arxiv 2023]
- A Language Agent for Autonomous Driving [arxiv 2023]
- Human-Centric Autonomous Systems With LLMs for User Command Reasoning [WACVW 2024]
- On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving [arxiv 2023]
- Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving [arxiv 2023, Github]
- GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models [arxiv 2023, Github]
- ChatGPT as Your Vehicle Co-Pilot: An Initial Attempt [IEEE TIV 2023]
- DriveLLM: Charting The Path Toward Full Autonomous Driving with Large Language Models [IEEE TIV 2023]
- Applications of Large Scale Foundation Models for Autonomous Driving [arxiv 2023]
- A Survey on Multimodal Large Language Models for Autonomous Driving [arxiv 2023]
- A Survey of Large Language Models for Autonomous Driving [arxiv 2023]
- Vision Language Models in Autonomous Driving and Intelligent Transportation Systems [arxiv 2023]
- A Survey of Simulators for Autonomous Driving: Taxonomy, Challenges, and Evaluation Metric [arxiv 2023]
- Towards Knowledge-driven Autonomous Driving [arxiv 2023]
- [WACV2024 Workshop] MAPLM: A Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding
- [Blog] LINGO-1: Exploring Natural Language for Autonomous Driving
- [Blog] Introducing GAIA-1: A Cutting-Edge Generative AI Model for Autonomy
If you find our paper useful, please kindly cite us via:
@article{li2023knowledgedriven,
  title={Towards Knowledge-driven Autonomous Driving},
  author={Li, Xin and Bai, Yeqi and Cai, Pinlong and Wen, Licheng and Fu, Daocheng and Zhang, Bo and Yang, Xuemeng and Cai, Xinyu and Ma, Tao and Guo, Jianfei and Gao, Xing and Dou, Min and Shi, Botian and Liu, Yong and He, Liang and Qiao, Yu},
  journal={arXiv preprint arXiv:2312.04316},
  year={2023}
}
Awesome Knowledge-driven Autonomous Driving is released under the Apache 2.0 license.