[MLAQ Demo Website] [Experimental Data] [Introduction Video]
This is the codebase for the paper: Towards Optimal Long-Horizon Decision-Making by Model-based LLM Agent with Q-Planner. We welcome readers to run our code; if you have any questions, please raise them in the issues. We have also created an interactive demo at mlaq.site, which readers are encouraged to visit to gain a deeper understanding of our work.
Notice #1: We recorded all the data during our experiments and stored it in Google Drive.
Notice #2: We provide a demo website at mlaq.site to help readers better understand our work.
conda create -n mlaq python=3.8
conda activate mlaq
pip install mujoco==2.3.0
pip install dm_control==1.0.8
pip install -r requirements.txt
In run_MLAQ.py, replace the placeholders in the following code with your OpenAI API key and API base URL:
OPENAI_KEY = str("your_openai_key_here")
os.environ["OPENAI_API_KEY"] = OPENAI_KEY
os.environ["OPENAI_BASE_URL"] = "your_openai_base_url_here"
openai.api_base = "your_openai_base_url_here"
openai.api_key = OPENAI_KEY
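If you would rather not hardcode credentials in run_MLAQ.py, the sketch below shows one possible way to read them from the shell environment and fail early when the key is missing. The helper name load_openai_settings is hypothetical and is not part of this repository; the fallback base URL is the public OpenAI endpoint and is an assumption about your setup.

```python
import os


def load_openai_settings(env=os.environ):
    """Hypothetical helper (not in this repo): read API settings from the environment."""
    key = env.get("OPENAI_API_KEY", "")
    # Treat the README placeholder as "not configured".
    if not key or key == "your_openai_key_here":
        raise RuntimeError("Set OPENAI_API_KEY before running run_MLAQ.py")
    # Assumption: fall back to the public OpenAI endpoint when no base URL is set.
    base_url = env.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
    return key, base_url
```

The returned values could then be assigned in place of the hardcoded strings above, for example OPENAI_KEY, base_url = load_openai_settings().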
You can switch from the blocksworld domain to another domain by changing the --task argument. Available values include blocksworld, sort_central, sort_dialog, sandwich_central, and sandwich_dialog.
$ conda activate mlaq
(mlaq) $ python run_MLAQ.py --task blocksworld -llm gpt-4-0125-preview --skip_display --tree_load --optimal_steps=2 --start_run_id=0 --start_case_id=0
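To try several domains in turn, the command above can be generated for each task name. The following is a minimal sketch that only prints the command lines; it does not execute anything. The helper build_command is hypothetical, and reusing the same flags (for example --optimal_steps=2) for every task is an assumption that may not suit all domains.

```python
import shlex

# Task names as listed in this README.
TASKS = [
    "blocksworld",
    "sort_central",
    "sort_dialog",
    "sandwich_central",
    "sandwich_dialog",
]


def build_command(task, llm="gpt-4-0125-preview"):
    """Hypothetical helper: build the run_MLAQ.py command line for one task."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    # Flags are copied from the example command; whether they fit
    # every task is an assumption.
    return [
        "python", "run_MLAQ.py",
        "--task", task,
        "-llm", llm,
        "--skip_display",
        "--tree_load",
        "--optimal_steps=2",
        "--start_run_id=0",
        "--start_case_id=0",
    ]


# Print one shell-ready command per task (shlex.join requires Python 3.8+).
for task in TASKS:
    print(shlex.join(build_command(task)))
```

Each printed line can be pasted into the activated mlaq environment, or passed to subprocess.run if you prefer to launch the runs from Python.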