Arxiv preprint | DSPy Implementation
AvaTaR is a novel, automatic framework that optimizes an LLM agent to use its provided tools effectively and improve performance on a given task or domain. During optimization, a comparator module iteratively provides insightful, holistic prompts to the LLM agent by contrasting positive and negative examples sampled from the training data.
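At a high level, the contrastive sampling that feeds the comparator can be sketched in plain Python. This is an illustrative sketch only, not AvaTaR's actual implementation; the binary metric threshold and the sample sizes are assumptions:

```python
import random

def sample_contrastive_examples(trainset, metric, agent, k_pos=10, k_neg=10, seed=0):
    """Split training queries into positive (high-scoring) and negative
    (low-scoring) groups under the current agent, then sample from each.
    The comparator LLM contrasts these groups to propose improved
    instructions for the actor."""
    scored = [(ex, metric(ex, agent(ex))) for ex in trainset]
    positives = [ex for ex, score in scored if score > 0]
    negatives = [ex for ex, score in scored if score == 0]
    rng = random.Random(seed)
    pos_sample = rng.sample(positives, min(k_pos, len(positives)))
    neg_sample = rng.sample(negatives, min(k_neg, len(negatives)))
    return pos_sample, neg_sample
```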
[July 2024] 🔥 AvaTaR is integrated into DSPy (credit to Herumb Shandilya)! You can try out the example in the Jupyter notebook.
```
conda create -n avatar python=3.11
pip install stark-qa typeguard
```
- Specify API keys in the command line:

```
export ANTHROPIC_API_KEY=YOUR_API_KEY
export OPENAI_API_KEY=YOUR_API_KEY
export OPENAI_ORG=YOUR_ORGANIZATION
```
- Embeddings: Download all embeddings by running the following script:

```
sh scripts/emb_download_all.sh
```
- Raw data:
STaRK data will be downloaded automatically when running the code.
For Flickr30k Entities, submit the form at Flickr 30k & Denotation Graph data to request access. Then organize the data as follows:
```
data
├── flickr30k_entities
│   ├── raw
│   │   ├── Annotations
│   │   │   ├── 36979.xml
│   │   │   ├── ...
│   │   ├── flickr30k-images
│   │   │   ├── 36979.jpg
│   │   │   ├── ...
│   ├── split
│   │   ├── test.index
│   │   ├── train.index
│   │   ├── val.index
│   ├── qa.csv
├── ...
```
We already include the VSS results locally under `output/eval` and the grouping (for STaRK only) under `output/agent`. With these files, you should be able to optimize actor actions directly following the AvaTaR pipeline.
- Optimization: Following the default settings in `config/default_args.json`, run one of the following commands to optimize the actor actions for a group of queries:

```
sh scripts/run_avatar_stark.sh
```

or

```
sh scripts/run_avatar_flickr30k_entities.sh
```

You can specify the dataset name and group in `scripts/run_avatar_stark.sh`.
- Evaluation: Run one of the following commands to evaluate the optimized actor actions:

```
sh scripts/run_eval_avatar_stark.sh
```

or

```
sh scripts/run_eval_avatar_flickr30k_entities.sh
```
AvaTaR is now integrated with DSPy as the `Avatar` module for agent execution and the `AvatarOptimizer` for actor optimization. To use Avatar, you'll need a task signature and tools.
- Task Signature: a `dspy.Signature` class defining the structure of your task. For a QA-style task, you can create a signature with a `question` input field and an `answer` output field.
- Tools: a list of `Tool` objects (imported from `dspy.predict.avatar`) wrapping tools in the LangChain tool format.
Here is an example of defining tools and initializing an Avatar agent:
```python
from dspy.predict.avatar import Tool, Avatar
from langchain_community.utilities import GoogleSerperAPIWrapper, ArxivAPIWrapper

tools = [
    Tool(
        tool=GoogleSerperAPIWrapper(),
        name="WEB_SEARCH",
        desc="If you have a question, you can use this tool to search the web for the answer.",
    ),
]

agent = Avatar(
    tools=tools,
    signature="question->answer",
    verbose=True,
)
```
You can execute it like any other DSPy module by passing the inputs you specified in your task signature:
```python
answer = agent(question=question)
```
You can optimize the actor for better tool usage with `AvatarOptimizer`, which optimizes it using the comparator module:
```python
from dspy.teleprompt import AvatarOptimizer

def metric(example, prediction, trace=None):
    ...  # score the prediction against the gold example

teleprompter = AvatarOptimizer(
    metric=metric,
    max_iters=1,
    max_negative_inputs=10,
    max_positive_inputs=10,
)

optimized_arxiv_agent = teleprompter.compile(
    student=agent,
    trainset=trainset,
)
```
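The `metric` above is left as a stub. For a QA signature like `question->answer`, a minimal sketch could be an exact-match check (this assumes both `example` and `prediction` expose an `answer` field, as in the signature above):

```python
from types import SimpleNamespace

def exact_match_metric(example, prediction, trace=None):
    """Return 1 if the predicted answer matches the gold answer
    (ignoring case and surrounding whitespace), else 0."""
    return int(example.answer.strip().lower() == prediction.answer.strip().lower())

# Tiny demo with stand-in objects; real usage passes dspy examples/predictions.
gold = SimpleNamespace(answer="Paris")
pred = SimpleNamespace(answer=" paris ")
```

A binary metric like this is what lets the optimizer split the training queries into positive and negative groups for the comparator.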
For a detailed walkthrough, refer to the notebook in the DSPy repo.
@article{wu24avatar,
title = {AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval},
author = {
Shirley Wu and Shiyu Zhao and
Qian Huang and Kexin Huang and
Michihiro Yasunaga and Kaidi Cao and
Vassilis N. Ioannidis and Karthik Subbian and
Jure Leskovec and James Zou
},
eprinttype = {arXiv},
eprint = {2406.11200},
year = {2024}
}