https://github.com/dingchaoyue/Awesome-Autonomous-Driving-LLM/assets/38074924/16e2d1d0-74f3-41fd-9075-4e4eb42b3b8b
This repo supplements our survey: Sparks of Large Audio Models: A Survey and Outlook.
Abstract: This survey paper provides a comprehensive overview of the recent advancements and challenges in applying large language models to the field of audio signal processing. Audio processing, with its diverse signal representations and a wide range of sources--from human voices to musical instruments and environmental sounds--poses challenges distinct from those found in traditional Natural Language Processing scenarios. Nevertheless, Large Audio Models, epitomized by transformer-based architectures, have shown marked efficacy in this sphere. By leveraging massive amounts of data, these models have demonstrated prowess in a variety of audio tasks, spanning from Automatic Speech Recognition and Text-To-Speech to Music Generation, among others. Notably, these Foundational Audio Models, like SeamlessM4T, have recently started showing abilities to act as universal translators, supporting multiple speech tasks for up to 100 languages without any reliance on separate task-specific systems. This paper presents an in-depth analysis of state-of-the-art methodologies regarding Foundational Large Audio Models, their performance benchmarks, and their applicability to real-world scenarios. We also highlight current limitations and provide insights into potential future research directions in the realm of Large Audio Models, with the intent to spark further discussion and thereby foster innovation in the next generation of audio-processing systems.
A curated list of awesome large AI models in audio signal processing, inspired by other awesome initiatives. We intend to regularly update this page with the latest relevant papers and their open-source implementations.
We demonstrate the performance of our system in a variety of complex scenarios. The pink vehicle is the ego vehicle, the gray circle marks its sensing range, green vehicles have been sensed, blue vehicles have not yet been sensed, and the red vehicle is the one the LLM is currently attending to.
This example showcases the LLM's ability to understand and reason about high-level information, affirming the effectiveness of our chain-of-thought approach. The combination of attention allocation, situational awareness, and action guidance ensures that our system consistently exhibits correct driving behavior.
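As a rough illustration of the chain-of-thought pipeline described above, here is a minimal Python sketch of the three stages. `query_llm`, the prompt wording, and the action vocabulary are hypothetical stand-ins, not the actual implementation.

```python
def query_llm(prompt: str) -> str:
    """Placeholder for a call to any chat-completion LLM endpoint."""
    raise NotImplementedError  # swap in a real client here

def decide(scene_description: str) -> str:
    # Stage 1: attention allocation -- which vehicle matters most right now?
    attention = query_llm(
        "Given the scene below, name the single vehicle that most affects "
        f"the ego vehicle's next decision.\n{scene_description}"
    )
    # Stage 2: situational awareness -- reason about that vehicle's intent.
    situation = query_llm(
        f"Scene:\n{scene_description}\nFocus vehicle: {attention}\n"
        "Describe the risk this vehicle poses to the ego vehicle."
    )
    # Stage 3: action guidance -- map the reasoning to a discrete action.
    return query_llm(
        f"Situation assessment: {situation}\n"
        "Choose one action: ACCELERATE, DECELERATE, LANE_LEFT, LANE_RIGHT, IDLE."
    )
```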
Our approach lets users, or a high-precision map, provide textual instructions that guide the AD system's decision-making process. We conducted an experiment involving a road construction scenario: upon receiving the textual guidance, our approach successfully recognized the situation and produced appropriate driving behavior.
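As a hedged sketch of how such textual guidance could be injected, the snippet below turns a map annotation into an instruction prepended to the decision prompt; `MapEvent` and its fields are illustrative assumptions, not the system's real schema.

```python
from dataclasses import dataclass

# Illustrative container for a high-precision-map annotation or a user note.
@dataclass
class MapEvent:
    location: str      # e.g. a road segment identifier
    description: str   # e.g. "road construction, right lane closed"

def build_instruction(event: MapEvent) -> str:
    # Prepended to the scene description before the decision query,
    # so the LLM can factor the guidance into its reasoning.
    return (f"Guidance: near {event.location}, {event.description}. "
            "Adjust the driving plan accordingly.")

construction = MapEvent("segment K-14", "road construction, right lane closed")
print(build_instruction(construction))
```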
Our approach simplifies driving-style adjustment: a textual description is simply passed to the LLM through a dedicated interface. When the risk of overtaking is low, an LLM instructed to drive aggressively makes reasonable overtaking decisions, while one directed to drive conservatively opts to slow down and follow the vehicle in front of it.
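One plausible realization of this interface is a style-conditioned system prompt, sketched below; the wording of the style descriptions is our assumption, not the exact text used by the system.

```python
# Hypothetical driving-style descriptions injected ahead of every decision query.
STYLE_PROMPTS = {
    "aggressive": (
        "You prefer to reach the destination quickly. Overtake slower "
        "vehicles whenever the estimated risk of doing so is low."
    ),
    "conservative": (
        "You prioritize safety and comfort. Keep a large headway and "
        "follow the vehicle ahead rather than overtaking when in doubt."
    ),
}

def system_prompt(style: str) -> str:
    # Changing driving behavior then amounts to swapping one line of text.
    return ("You are the decision module of an autonomous vehicle. "
            + STYLE_PROMPTS[style])

print(system_prompt("conservative"))
```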
- Popular Large Audio Models
  - [Automatic Speech Recognition (ASR)](#automatic-speech-recognition-asr)
  - [Neural Speech Synthesis](#neural-speech-synthesis)
  - [Speech Translation (ST)](#speech-translation-st)
  - [Other Audio Applications](#other-audio-applications)
- Large Audio Models in Music
- Audio Datasets
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving [2023].
Licheng Wen, Xuemeng Yang, Daocheng Fu, Xiaofeng Wang, Pinlong Cai, Xin Li, Tao Ma, Yingxuan Li, Linran Xu, Dengke Shang, Zheng Zhu, Shaoyan Sun, Yeqi Bai, Xinyu Cai, Min Dou, Shuanglu Hu, Botian Shi
[PDF]
A Survey of Large Language Models for Autonomous Driving [2023].
Zhenjie Yang, Xiaosong Jia, Hongyang Li, Junchi Yan
[PDF]
Drive like a human: Rethinking autonomous driving with large language models [2023].
Daocheng Fu, Xin Li, Licheng Wen, Min Dou, Pinlong Cai, Botian Shi, Yu Qiao
[PDF]
Title | Full Name | Size | Link |
---|---|---|---|
CommonVoice 11 | Common Voice: A Massively Multilingual Speech Corpus | 58,250 voices / 2,508 hours | Download |
Libri-Light | Libri-Light: A Benchmark for ASR with Limited or No Supervision | 60,000 hours | Download |
WenetSpeech | WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition | 10,000 hours | [Download](https://github.com/wenet-e2e/WenetSpeech) |
MTG-Jamendo | The MTG-Jamendo Dataset for Automatic Music Tagging | 55,525 tracks | Download |
If you find the listing and survey useful for your work, please cite the paper:
@article{latif2023sparks,
  title={Sparks of Large Audio Models: A Survey and Outlook},
  author={Latif, Siddique and others},
  journal={arXiv preprint arXiv:2308.12792},
  year={2023}
}