[Feat]: Task Revival Proposal
lukehinds opened this issue · 2 comments
Is your feature request related to a problem? Please describe.
Original discussion: #435
Currently, when a task reaches a terminal state (completed, failed, canceled, rejected), it cannot accept new messages. This creates issues for conversational agents that need to handle follow-up modifications to completed actions.
Example scenario:
User: "Book me a meeting with john@example.com next week at 14:00"
Agent: "Meeting booked for next Tuesday at 14:00" [task completed]
1 hour later, same task_id...
User: "Actually, can you move it to 16:00?"
Agent: ERROR - Task is in terminal state: completed
This forces users to start new tasks for modifications, losing context and creating a subpar experience for calendar scheduling, order modifications, and other conversational use cases.
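For reference, the rejection in the scenario above can be reproduced with a minimal sketch. The `Task`, `TaskState`, and `TerminalTaskError` types here are simplified stand-ins written for this example, not the SDK's actual classes, and `accept_message` is a hypothetical helper.

```python
from dataclasses import dataclass
from enum import Enum


class TaskState(Enum):
    # Stand-in for the SDK's TaskState; only the states this example needs.
    working = "working"
    completed = "completed"
    failed = "failed"
    canceled = "canceled"
    rejected = "rejected"


TERMINAL_TASK_STATES = {
    TaskState.completed,
    TaskState.failed,
    TaskState.canceled,
    TaskState.rejected,
}


@dataclass
class Task:
    id: str
    state: TaskState


class TerminalTaskError(Exception):
    """Hypothetical error type; the SDK raises its own ServerError."""


def accept_message(task: Task, text: str) -> str:
    # Mirrors the current rule: a terminal task rejects every new message.
    if task.state in TERMINAL_TASK_STATES:
        raise TerminalTaskError(
            f"Task {task.id} is in terminal state: {task.state.value}"
        )
    return f"Task {task.id} accepted: {text}"
```

Sending "move it to 16:00" to the completed booking task raises the error, which is exactly the dead end the user hits.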
Describe the solution you'd like
Proposed Solution
Add a hook in the AgentExecutor base class that lets agent developers control whether a completed task can be revived. The default is the current 'no'.
# In a2a/server/agent_execution/agent_executor.py
class AgentExecutor(ABC):
    # ... existing abstract methods ...

    async def can_revive_task(self, task: Task, message: Message) -> bool:
        return False  # Default: maintain current behavior

We then update the DefaultRequestHandler:
# In DefaultRequestHandler._setup_message_execution()
if task:
    if task.status.state in TERMINAL_TASK_STATES:
        # Check if this is a completed task that should be revived
        if (task.status.state == TaskState.completed and
                await self.agent_executor.can_revive_task(task, params.message)):
            # Revive task to working state
            old_state = task.status.state
            task.status.state = TaskState.working
            # Update revival metadata
            if not task.metadata:
                task.metadata = {}
            task.metadata["last_revival"] = datetime.now().isoformat()
            task.metadata["revival_count"] = task.metadata.get("revival_count", 0) + 1
            # Continue with normal message processing
            task = task_manager.update_with_message(params.message, task)
        else:
            # Current behavior: reject terminal state messages
            raise ServerError(
                error=InvalidParamsError(
                    message=f'Task {task.id} is in terminal state: {task.status.state.value}'
                )
            )
    else:
        # Task is not terminal, process normally
        task = task_manager.update_with_message(params.message, task)

This leaves the decision to the agent (or whoever manages the agent) and does not break backward compatibility.
A user can then do something like the following in their inherited AgentExecutor:

class OrderAgent(AgentExecutor):
    async def can_revive_task(self, task: Task, message: Message) -> bool:
        """Allow order modifications until shipped."""
        # It's already packaged and ready for delivery
        # (metadata may be unset, so fall back to an empty dict)
        if (task.metadata or {}).get("status") == "shipped":
            return False
        # Check if it's within the allowed modification window (2 hours)
        completed_at = datetime.fromisoformat(task.status.timestamp)
        # Match the timestamp's timezone so aware and naive values both compare cleanly
        return (datetime.now(completed_at.tzinfo) - completed_at) < timedelta(hours=2)

Describe alternatives you've considered
I considered a simple boolean flag (allow_completed_task_revival=True) but decided not to, because:
- Too coarse-grained (all or nothing)
- No ability to implement business logic
- No ability to implement security/abuse controls
- Not extensible for different task types
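To make the contrast with a boolean flag concrete, here is a minimal sketch of a per-task decision that combines a business rule with an abuse control. The constants `MAX_REVIVALS` and `REVIVAL_WINDOW` and the standalone `can_revive` function are hypothetical; only the `revival_count` metadata key comes from the proposal above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_REVIVALS = 3                     # hypothetical abuse limit
REVIVAL_WINDOW = timedelta(hours=2)  # hypothetical business rule


def can_revive(metadata: Optional[dict], completed_at: datetime, now: datetime) -> bool:
    """Decide revival per task, which a single global flag could not do."""
    metadata = metadata or {}
    # Abuse control: stop endless revive loops on one task.
    if metadata.get("revival_count", 0) >= MAX_REVIVALS:
        return False
    # Business logic: only recently completed tasks may be modified.
    return now - completed_at < REVIVAL_WINDOW
```

A global flag would have to answer the same way for every task; the hook can say yes to a fresh order and no to one that has already been revived three times.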
So this essentially lets the developer opt out of the terminal task immutability property. As a note, this part of the spec is currently only mentioned in Life of a Task:
Once a task has reached a terminal state (completed, cancelled, rejected or failed), it can't be restarted
@darrelmiller this is one of those probably-should-be-normative parts that we should pull out to the main spec.
This would technically be a protocol change, so I think we'll want consensus on whether to lift the terminal task immutability requirement, as well as the right way to do so. If you want to move forward with this, go ahead and propose it on the A2A repo directly.
I'll share my overall thoughts here:
I'd definitely be interested in hearing from others on how they feel about the tradeoffs for terminal task immutability. We originally introduced this concept because it can simplify some types of logic -- once a task has reached a finalized state, a client can keep a copy of it themselves and know that it will never change. The server can also drop all registered push notification configs, since nothing will ever change again. If a task can be restarted, it can be difficult for a client to know whether it needs to check on the state of a task. Imagine a poorly-coordinated distributed system case where one worker decides to restart a task and another worker wants to use the outputs of that task as part of its planning -- with immutability, the worker using the outputs just needs to wait for the task to reach a terminal state before it can use the outputs; without, it would need to know whether the task got restarted for some reason. I believe this was the idea for the statement of "Useful for mapping client orchestrator to task execution" in the Life of a Task document.
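The client-side simplification that immutability buys can be sketched roughly as follows. `TaskCache` and its `fetch` callable are hypothetical names for this example, and tasks are plain dicts rather than SDK objects.

```python
TERMINAL_STATES = {"completed", "failed", "canceled", "rejected"}


class TaskCache:
    """Client-side cache that only refetches tasks that can still change."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: task_id -> dict with a "state" key
        self._final = {}      # task_id -> terminal snapshot
        self.fetch_count = 0

    def get(self, task_id: str) -> dict:
        # With immutability, a terminal snapshot stays valid forever.
        if task_id in self._final:
            return self._final[task_id]
        self.fetch_count += 1
        task = self._fetch(task_id)
        if task["state"] in TERMINAL_STATES:
            self._final[task_id] = task
        return task
```

If tasks could be revived, the early return would no longer be safe: the client would need some invalidation signal, or it would have to refetch every time.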
One downside to this property that I've always been unsatisfied with is that it kind of clashes with reality: the agent determines when it's done with the work, but the client is really the one who determines whether the task was satisfied. And if the task was not satisfied, the client will want the agent to continue. Does that mean a client needs to have its own concept of a super-task, such that it can group all related work it requested of an agent to perform the conceptual task it gave to the agent? Think: "please generate a picture of a cat and a dog" -> "[Task 1: done] Ok, here's a picture of a dog!" -> "I asked for a cat AND a dog. Do it again!" -> "[Task 2: done] Ok, here's a picture of a cat and a dog" -> "No, I mean one picture of a cat and another, separate, picture of a dog" -> "[Task 3: done] Ok, here's a picture of a dog" -> 😠 . It's up to the client to keep track of all these tasks that are related to the work it's attempting to get done. If a client could instead just keep reviving a task until it's happy with the result, that requirement goes away (but other complications arise).
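The "super-task" bookkeeping a client would need today might look something like this. `SuperTask` and `record_attempt` are hypothetical names; nothing like this exists in the protocol.

```python
from dataclasses import dataclass, field


@dataclass
class SuperTask:
    """Hypothetical client-side grouping of agent tasks that serve one goal."""
    goal: str
    task_ids: list[str] = field(default_factory=list)
    satisfied: bool = False

    def record_attempt(self, task_id: str, accepted: bool) -> None:
        # Every retry is a new agent task; the client links them itself.
        self.task_ids.append(task_id)
        self.satisfied = accepted
```

In the cat-and-dog exchange, the client would end up with three task IDs under one goal and `satisfied` still false.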
So, tradeoffs, as usual.
A less controversial change would be to allow developers to change the current behavior of failing when a message is received on a terminal task to a behavior of starting a new task that references the previous task. That's both protocol-compliant (an agent is free to ignore or change whatever taskId was set in an incoming message) and seems useful.
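A rough sketch of that fallback, assuming a simplified `Task` stand-in: the `related_task_ids` field name and the `resolve_task` helper are placeholders for illustration, not the SDK's or the spec's actual names.

```python
import uuid
from dataclasses import dataclass, field

TERMINAL_STATES = {"completed", "failed", "canceled", "rejected"}


@dataclass
class Task:
    id: str
    state: str
    related_task_ids: list[str] = field(default_factory=list)  # placeholder name


def resolve_task(existing: Task) -> Task:
    """Return the task that should handle an incoming message."""
    if existing.state not in TERMINAL_STATES:
        return existing
    # The old task stays untouched; the follow-up gets a fresh task
    # that links back to it.
    return Task(
        id=str(uuid.uuid4()),
        state="submitted",
        related_task_ids=[existing.id],
    )
```

The terminal task is never mutated, so clients that cached it stay correct, while the new task carries a pointer back to the earlier context.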
I like your approach of having the AgentExecutor hold the logic. One option that would hold up here even if we did make the protocol change is to switch from a boolean to an enum:
from enum import Enum

# Describes what action to take when an incoming message references a task in a terminal state.
class TerminalTaskActionType(Enum):
    # Fail the request due to referencing a terminal task.
    FAIL = 1
    # Create a new Task, with the old Task included in the related_tasks field.
    CREATE_NEW_WITH_RELATED = 2
    # In the future: REVIVE = 3

class AgentExecutor:
    # Naming is hard.
    async def get_terminal_task_action(self, task: Task, message: Message) -> TerminalTaskActionType:
        return TerminalTaskActionType.FAIL  # Default: maintain current behavior

Although initially I was not a fan of the change to make servers own the lifetime of the task and the introduction of the terminal state, I have started to appreciate it. @mikeas1 I agree this needs to be a normative part of the specification.
@lukehinds I think the example you gave is actually a great example of why related tasks are a better fit than restarting a completed task. The user asked for the agent to book a meeting. The agent booked the meeting at the requested time. That is a reasonable conclusion of the requested task. The fact that the user changed their mind and then asked for that meeting to be rescheduled is a new "update meeting" task. It is closely related to the "create meeting" task but they are distinct tasks, in my opinion.
The reason I think they should be distinct tasks is because I think that tasks could be a really useful metric for measuring the value of agents. Ideally agents should complete as many tasks as possible with the minimum number of messages sent by the user. If a user has to send multiple messages to complete a task, then that is slightly less effective than a task getting completed with a single message. If it is possible to restart a task because the user changes their mind about the objective, then the agent "value" would be penalized because it now took multiple messages to complete a single task. With the current design and the suggested scenario, the agent has completed two tasks each only requiring a single input message. I think that is more representative of the effectiveness of the agent.
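As a worked example of that arithmetic, here is one hypothetical way to compute such a metric; `messages_per_task` is an illustrative helper, not an existing measurement in the SDK.

```python
def messages_per_task(message_counts: list[int]) -> float:
    """Average user messages needed per completed task (lower is better)."""
    return sum(message_counts) / len(message_counts)


# Meeting scenario with revival: one task that took two user messages.
with_revival = messages_per_task([2])

# Same scenario with related tasks: two tasks, one user message each.
with_related = messages_per_task([1, 1])
```

Under this measure the revived task scores 2.0 messages per task, while the two related tasks score 1.0.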
@lukehinds Is there a reason why the use of related tasks could not maintain the necessary context in order to keep a good user experience?