/Agent-Driver

A Language Agent for Autonomous Driving

Primary LanguagePythonMIT LicenseMIT

Agent-Driver

This is a repo of our arXiv pre-print Agent-Driver [Project Page].

Note: Running Agent-Driver requires an OpenAI API account

Introduction

Human-level driving is an ultimate goal of autonomous driving. Conventional approaches formulate autonomous driving as a perception-prediction-planning framework, yet their systems do not capitalize on the inherent reasoning ability and experiential knowledge of humans. In this paper, we propose a fundamental paradigm shift from current pipelines, exploiting Large Language Models (LLMs) as a cognitive agent to integrate human-like intelligence into autonomous driving systems. Our approach, termed Agent-Driver, transforms the traditional autonomous driving pipeline by introducing a versatile tool library accessible via function calls, a cognitive memory of common sense and experiential knowledge for decision-making, and a reasoning engine capable of chain-of-thought reasoning, task planning, motion planning, and self-reflection. Powered by LLMs, our Agent-Driver is endowed with intuitive common sense and robust reasoning capabilities, thus enabling a more nuanced, human-like approach to autonomous driving. We evaluate our approach on the large-scale nuScenes benchmark, and extensive experiments substantiate that our Agent-Driver significantly outperforms the state-of-the-art driving methods by a large margin. Our approach also demonstrates superior interpretability and few-shot learning ability to these methods.

Alt text

Installation

a. Clone this repository.

git clone https://github.com/PointsCoder/Agent-Driver.git

b. Install the dependent libraries as follows:

pip install -r requirements.txt 

Data Preparation

a. We used pre-cached data from the nuScenes dataset. The data can be downloaded at Google Drive.

b. You can put the downloaded data here:

Agent-Driver
├── data
│   ├── finetune
|   |   |── data_samples_train.json
|   |   |── data_samples_val.json
│   ├── memory
|   |   |── database.pkl
│   ├── metrics
|   |   |── gt_traj.pkl
|   |   |── gt_traj_mask.pkl
|   |   |── stp3_gt_seg.pkl
|   |   |── uniad_gt_seg.pkl
│   ├── train
|   |   |── [token].pkl
|   |   |── ...
│   ├── val
|   |   |── [token].pkl
|   |   |── ...
│   ├── split.json
├── agentdriver
├── scripts

Training

a. Before we start, we need to fine-tune a GPT-based motion planner (as in the reasoning engine). To do so, you first need to register an OpenAI API account.

b. After registration, you can generate your API-key and your oganization key in your account settings. Here is an example:

openai.api_key = "sk-**"
openai.organization = "org-**"

c. You need to specify your own keys in the agentdriver/llm_core/api_keys.py, and this will be used in running Agent-Driver.

Please note that this is your own key and will be linked to your bill payment, so keep this confidential and do not distribute it to others!

d. For fine-tuning a motion planner, simply run

sh scripts/run_finetune.sh

will automatically collect data and send finetuning jobs to OpenAI. More details can be found in agentdriver/execution/fine_tune.py.

Note: Fine-tuning costs money. Please refer to the pricing page. To save your money, by default we use 10% of the full training data for fine-tuning, from which you are supposed to get decent results with less than 10$ usd. You can get better results by using 100% data, and in this setting you may want to specify sample_ratio=1.0 in agentdriver/execution/fine_tune.py.

d. When your fine-tune job successfully completes, you will receive an email notifying your fine-tuned GPT model id, like this

ft:gpt-3.5-turbo-0613:**::**

This model id denotes your own GPT-based motion planner. You need to specify this model id in FINETUNE_PLANNER_NAME of agentdriver/llm_core/api_keys.py.

Inference

a. Once all keys in agentdriver/llm_core/api_keys.py have been set up correctly, you can inference the whole Agent-Driver pipeline in agentdriver/unit_test/test_lanuage_agent.ipynb.

b. You can also find the usage of individual components (tool library, cognitive memory, reasoning engine) in the folder agentdriver/unit_test.

Evaluation

a. If you want to evaluate the planning performance on nuScenes validation set, you can first collect the motion planning results by running

sh scripts/run_inference.sh

You will get a pred_trajs_dict.pkl in the experiments folder.

b. For evaluation, you can run

sh scripts/run_evaluation.sh uniad YOUR_PRED_DICT_PATH

with your pred_trajs_dict.pkl file location.

Citation

If you find this project useful in your research, please consider citing:

@article{agentdriver,
  title={A Language Agent for Autonomous Driving},
  author={Mao, Jiageng and Ye, Junjie and Qian, Yuxi and Pavone, Marco and Wang, Yue},
  year={2023}
}