AgentQ uses several agentic architectures to complete tasks on the web reliably. It has:
1. a planner <> navigator multi-agent architecture
2. a solo planner-actor agent
3. an actor <> critic multi-agent architecture
4. an actor <> critic architecture + Monte Carlo tree search (MCTS) based reinforcement learning + DPO fine-tuning
This repo also contains an open-source implementation of the research paper Agent Q, hence the name.
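As a rough illustration of how the pieces of the fourth setup can fit together, the sketch below picks a candidate action with the standard UCB1 rule and turns per-action value estimates into (chosen, rejected) preference pairs of the kind DPO trains on. The `Child` class, the `min_gap` threshold, and the exploration constant `c` are assumptions made for this example; they are not taken from this repo's code.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Child:
    """Hypothetical search-tree child: one candidate browser action."""
    action: str
    visits: int = 0
    value_sum: float = 0.0

    @property
    def q(self) -> float:
        # Mean value of the rollouts that passed through this action.
        return self.value_sum / self.visits if self.visits else 0.0


def ucb1_select(children: list[Child], c: float = 1.4) -> Child:
    """Pick the child maximising Q + c * sqrt(ln(N) / n); unvisited children go first."""
    total = sum(ch.visits for ch in children)

    def score(ch: Child) -> float:
        if ch.visits == 0:
            return math.inf
        return ch.q + c * math.sqrt(math.log(total) / ch.visits)

    return max(children, key=score)


def preference_pairs(children: list[Child], min_gap: float = 0.2) -> list[tuple[str, str]]:
    """Return (chosen, rejected) action pairs whose Q-values differ by at least min_gap."""
    # Only compare actions that were actually explored.
    visited = [ch for ch in children if ch.visits > 0]
    pairs = []
    for a, b in combinations(visited, 2):
        if abs(a.q - b.q) >= min_gap:
            chosen, rejected = (a, b) if a.q > b.q else (b, a)
            pairs.append((chosen.action, rejected.action))
    return pairs
```

During search, `ucb1_select` would be called at each node to decide which action to expand next; once the tree has been explored, `preference_pairs` can be run over each node's children to collect training pairs. The gap threshold keeps near-ties out of the preference data.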
- We recommend installing Poetry before proceeding with the next steps. You can install Poetry using these instructions.
- Install dependencies:
poetry install
- Start Chrome in dev mode: in a separate terminal, use the command below to start a Chrome instance with remote debugging enabled, then do the necessary logins to job websites like LinkedIn, Wellfound, etc.
For Mac, use the command:
sudo /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
For Linux:
google-chrome --remote-debugging-port=9222
For Windows:
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
- Set up the environment: add your OpenAI and Langfuse keys to the .env file. You can refer to .env.example. Currently, adding Langfuse is required. If you do not want tracing, make the following changes:
- Import the OpenAI client directly via
import openai
rather than
from langfuse.openai import openai
in the ./agentq/core/agent/base.py file.
- Comment out the @observe decorator and the following piece of code from the run function in the same file:
langfuse_context.update_current_trace(name=self.agnet_name, session_id=session_id)
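One way to avoid hand-editing the import is to choose the client module based on whether Langfuse credentials are present. This is only a sketch: the environment variable names `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` are assumed here (check .env.example for the names this repo actually uses), and `openai_module_name` and `load_openai` are hypothetical helpers, not functions in this repo.

```python
import importlib
import os
from typing import Mapping

# Assumed variable names; confirm against .env.example.
LANGFUSE_KEYS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")


def openai_module_name(env: Mapping[str, str]) -> str:
    """Return the module to import: the Langfuse wrapper only if all keys are set."""
    if all(env.get(key) for key in LANGFUSE_KEYS):
        return "langfuse.openai"
    return "openai"


def load_openai(env: Mapping[str, str] = os.environ):
    """Import and return the OpenAI client module, traced or plain."""
    name = openai_module_name(env)
    module = importlib.import_module(name)
    # langfuse.openai exposes the wrapped client as its `openai` attribute.
    return module.openai if name == "langfuse.openai" else module
```

This covers only the import. The decorator and the trace update call in the run function would still need to be disabled as described above.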
- Run the agent:
python -u -m agentq
python -m test.tests_processor --orchestrator_type fsm
python -m agentq.core.mcts.browser_mcts
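Before running any of these commands, it can help to confirm that the Chrome instance started earlier with --remote-debugging-port=9222 is actually reachable. The sketch below queries Chrome's DevTools `/json/version` HTTP endpoint using only the standard library. The helper names are hypothetical and not part of this repo, and the host `127.0.0.1` is an assumption about where Chrome is listening.

```python
import json
import urllib.request


def devtools_version_url(port: int = 9222, host: str = "127.0.0.1") -> str:
    """Build the URL of Chrome's DevTools version endpoint."""
    return f"http://{host}:{port}/json/version"


def websocket_url_from(payload: str) -> str:
    """Extract the browser-level WebSocket debugger URL from the endpoint's JSON reply."""
    return json.loads(payload)["webSocketDebuggerUrl"]


def check_chrome(port: int = 9222, timeout: float = 2.0) -> str:
    """Return the WebSocket debugger URL, or raise if Chrome is not reachable."""
    with urllib.request.urlopen(devtools_version_url(port), timeout=timeout) as resp:
        return websocket_url_from(resp.read().decode("utf-8"))
```

Calling `check_chrome()` raises `urllib.error.URLError` if nothing is listening on the port, which usually means Chrome was not started with the remote debugging flag or is using a different port.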
A lot of great work in this space has inspired this project:
@misc{putta2024agentqadvancedreasoning,
title={Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents},
author={Pranav Putta and Edmund Mills and Naman Garg and Sumeet Motwani and Chelsea Finn and Divyansh Garg and Rafael Rafailov},
year={2024},
eprint={2408.07199},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2408.07199},
}
@misc{yao2022webshop,
title={WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents},
author={Shunyu Yao and Howard Chen and John Yang and Karthik Narasimhan},
year={2022},
eprint={2207.01206},
archivePrefix={arXiv},
url={https://arxiv.org/abs/2207.01206},
}
@article{he2024webvoyager,
title={WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models},
author={He, Hongliang and Yao, Wenlin and Ma, Kaixin and Yu, Wenhao and Dai, Yong and Zhang, Hongming and Lan, Zhenzhong and Yu, Dong},
journal={arXiv preprint arXiv:2401.13919},
year={2024}
}
@misc{abuelsaad2024-agente,
title={Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems},
author={Tamer Abuelsaad and Deepak Akkil and Prasenjit Dey and Ashish Jagmohan and Aditya Vempaty and Ravi Kokku},
year={2024},
eprint={2407.13032},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2407.13032},
}