The modular agent is structured as an Agent, which includes a State, which includes a list of Nodes, which are wrapped Messages, which are meant as a common format for different APIs' chat prompt elements. Defined in the agent are names for 5 different modules that chiefly interact with the State. The module functions never return anything, and always just take the Agent class (with a State attribute) as their single argument. The module functions coordinate with one another by modifying the state's next_step
attribute, which is a dictionary that should contain both a module_type
value, and an args
subdictionary. Modules generally get their de facto arguments from this subdictionary. The modules are:
- Prompter: decides on a list of Messages to be used for the generation request. In general, it will populate
agent.state.next_step["module_type"]
with"generator"
, andagent.state.next_step["args"]["messages"]
with the list of messages. The most basic prompter will just return the unwrapped Messages in sequence from the list of Nodes, but more complex prompters can trim or alter the history of messages, include additional messages, etc. - Generator: produces generations from Middleman, generally using the prompter's list of messages as the chat prompt. It is the generator's responsibility to format the request in a way appropriate for the specific model it uses, and to interpret the response into a list of Messages. In general, it will populate
agent.state.next_step["module_type"]
with"discriminator"
,agent.state.next_step["args"]["options"]
with the list of processed generation options (in the form of Messages), andagent.state.next_step["args"]["generation_metadata"]
with any metadata about the generations to be added to the Node that will result from one of those generations (see next module). - Discriminator: produces a single Node that is added to the state's list of nodes. This can be directly based off a Message produced by the generator, or can be a new Message. In general, it will populate
agent.state.next_step["module_type"]
with"actor"
, and will not modifyagent.state.next_step["args"]
, instead directly adding the resulting Node to the agent's state. - Actor: makes any function call implied by the agent's state (generally by just looking at the last added node and checking if its message contains a function call), and adds a new Node with the function output, if applicble. In general, it will populate
agent.state.next_step["module_type"]
with"prompter"
, and will not modifyagent.state.next_step["args"]
, instead directly adding the resulting Node to the agent's state. NOTE: for more flexible agents, we may want the Discriminator to instead pass in a list of node IDs to be considered by the Actor, instead of just having the Actor look at the last node. - Toolkit: lists the tools available to the agent. It is not a step in the agent loop.
In principle, all combinations of modules should be supported and make sense. In practice this isn't quite the case (but the mismatches should be the exception rather than the rule!).
The State ends up being very rich, and a substantial amount of agent debugging can be done by using fixed states and manually setting e.g. agent.state.next_step
to hand-crafted values.