Tool decorated with `@tool` returns string instead of structured Pydantic object in `on_tool_end` callback
Closed this issue · 2 comments
Checked other resources
- This is a bug, not a usage question.
- I added a clear and descriptive title that summarizes this issue.
- I used the GitHub search to find a similar question and didn't find it.
- I am sure that this is a bug in LangChain rather than my code.
- The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
- This is not related to the langchain-community package.
- I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
- I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
Example Code
I'm using the `@tool` decorator from `langchain_core.tools` to define an async tool that returns a Pydantic model (`CreateTestDataResponse`). The return type is correctly annotated and the function returns a valid Pydantic instance, but when I observe the output in the `on_tool_end` callback (via tracing), the result appears as a stringified representation rather than the original structured JSON or Pydantic object.
```python
from typing import List, Optional

from langchain_core.tools import tool
from pydantic import BaseModel, Field


class TestDataExample(BaseModel):
    prompt: str = Field(..., description="Prompt text")
    max_tokens: Optional[int] = Field(None, description="Max tokens")
    expected: Optional[str] = Field(None, description="Expected output")


class CreateTestDataResponse(BaseModel):
    data_url: str
    status: str
    url: str
    test_type: str
    model: str
    total_count: int
    examples: List[TestDataExample]
    message: str


@tool(parse_docstring=True, return_direct=False)
async def create_test_data(test_type: str, test_model_id: str, sample_count: int = 100) -> CreateTestDataResponse:
    """Create test data.

    Args:
        test_type: Type of test, e.g., performance/functional.
        test_model_id: Model ID, e.g., qw35/qw45.
        sample_count: Number of samples, default 100.

    Returns:
        CreateTestDataResponse: Structured response with test data info.
    """
    # ... (mock logic to build data_url, url, example_objects)
    return CreateTestDataResponse(
        data_url=data_url,
        status="success",
        url=url,
        test_type=test_type,
        model=test_model_id,
        total_count=sample_count,
        examples=example_objects,
        message="Test data created successfully",
    )
```
Expected Behavior
In the `on_tool_end` callback or in LangSmith traces, I expect the tool's output to be accessible as a structured dictionary or JSON-compatible object (ideally preserving Pydantic types), so that downstream processing can access nested fields like `examples[0].prompt` directly.
Actual Behavior
The output in `on_tool_end` is received as a formatted string, for example:
```
data_url='data-68280' status='success' url='https://data.example.com/data-68280' test_type='functional' model='qw35' total_count=20 examples=[TestDataExample(prompt='1+1=?', max_tokens=None, expected='2'), ...] message='Test data created successfully'
```
This makes it difficult to parse or extract structured information without fragile string parsing.
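For context, the format shown above matches what Pydantic v2's default `__str__` produces, which suggests the tool runtime is coercing the return value with `str()` somewhere before it reaches the callback. A minimal sketch (the `Demo` model is illustrative, not from the original code):

```python
from pydantic import BaseModel


class Demo(BaseModel):
    status: str
    total_count: int


# str() on a Pydantic v2 model yields the space-separated field=value form
# seen in the callback, without the class name that repr() would include.
stringified = str(Demo(status="success", total_count=20))
print(stringified)
# → status='success' total_count=20
```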
Question
How can I ensure that the tool preserves the return value as a JSON-serializable dict or raw Pydantic object in callbacks/tracing, instead of being converted to a string representation?
Is this related to serialization settings in @tool, or is there a way to configure the run tracer to keep structured outputs?
Any guidance or workaround would be appreciated!
System Info
System Information
OS: Darwin
OS Version: Darwin Kernel Version 23.5.0: Wed May 1 20:14:59 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T8122
Python Version: 3.11.9 (v3.11.9:de54cf5be3, Apr 2 2024, 07:12:50) [Clang 13.0.0 (clang-1300.0.29.30)]
Package Information
langchain_core: 0.3.74
langchain: 0.3.27
langchain_community: 0.3.27
langsmith: 0.4.4
langchain_anthropic: 0.3.3
langchain_aws: 0.2.18
langchain_mcp_adapters: 0.1.9
langchain_openai: 0.3.1
langchain_text_splitters: 0.3.9
langchainhub: 0.1.15
langgraph_sdk: 0.1.74
Optional packages not installed
langserve
Other Dependencies
aiohttp<4.0.0,>=3.8.3: Installed. No version info available.
anthropic: 0.64.0
async-timeout<5.0.0,>=4.0.0;: Installed. No version info available.
boto3: 1.39.9
dataclasses-json<0.7,>=0.5.7: Installed. No version info available.
defusedxml: 0.7.1
httpx: 0.27.0
httpx-sse<1.0.0,>=0.4.0: Installed. No version info available.
httpx>=0.25.2: Installed. No version info available.
jsonpatch<2.0,>=1.33: Installed. No version info available.
langchain-anthropic;: Installed. No version info available.
langchain-aws;: Installed. No version info available.
langchain-azure-ai;: Installed. No version info available.
langchain-cohere;: Installed. No version info available.
langchain-community;: Installed. No version info available.
langchain-core<0.4,>=0.3.36: Installed. No version info available.
langchain-core<1.0.0,>=0.3.66: Installed. No version info available.
langchain-core<1.0.0,>=0.3.72: Installed. No version info available.
langchain-deepseek;: Installed. No version info available.
langchain-fireworks;: Installed. No version info available.
langchain-google-genai;: Installed. No version info available.
langchain-google-vertexai;: Installed. No version info available.
langchain-groq;: Installed. No version info available.
langchain-huggingface;: Installed. No version info available.
langchain-mistralai;: Installed. No version info available.
langchain-ollama;: Installed. No version info available.
langchain-openai;: Installed. No version info available.
langchain-perplexity;: Installed. No version info available.
langchain-text-splitters<1.0.0,>=0.3.9: Installed. No version info available.
langchain-together;: Installed. No version info available.
langchain-xai;: Installed. No version info available.
langchain<1.0.0,>=0.3.26: Installed. No version info available.
langsmith-pyo3: Installed. No version info available.
langsmith>=0.1.125: Installed. No version info available.
langsmith>=0.1.17: Installed. No version info available.
langsmith>=0.3.45: Installed. No version info available.
mcp>=1.9.2: Installed. No version info available.
numpy: 1.26.4
numpy>=1.26.2;: Installed. No version info available.
numpy>=2.1.0;: Installed. No version info available.
openai: 1.58.1
openai-agents: Installed. No version info available.
opentelemetry-api: 1.36.0
opentelemetry-exporter-otlp-proto-http: Installed. No version info available.
opentelemetry-sdk: 1.36.0
orjson: 3.11.0
orjson>=3.10.1: Installed. No version info available.
packaging: 24.2
packaging>=23.2: Installed. No version info available.
pydantic: 2.11.7
pydantic-settings<3.0.0,>=2.4.0: Installed. No version info available.
pydantic<3.0.0,>=2.7.4: Installed. No version info available.
pydantic>=2.7.4: Installed. No version info available.
pytest: 7.4.0
PyYAML>=5.3: Installed. No version info available.
requests: 2.32.4
requests-toolbelt: 1.0.0
requests<3,>=2: Installed. No version info available.
rich: 14.0.0
SQLAlchemy<3,>=1.4: Installed. No version info available.
tenacity!=8.4.0,<10,>=8.1.0: Installed. No version info available.
tenacity!=8.4.0,<10.0.0,>=8.1.0: Installed. No version info available.
tiktoken: 0.9.0
types-requests: 2.32.4.20250611
typing-extensions>=4.14.0: Installed. No version info available.
typing-extensions>=4.7: Installed. No version info available.
zstandard: 0.23.0
Hello, I recommend trying the `response_format="content_and_artifact"` parameter in your `@tool` decorator. This should solve the problem of Pydantic objects being stringified in the `on_tool_end` callback.
Here's how to modify your tool:
```python
@tool(parse_docstring=True, return_direct=False, response_format="content_and_artifact")
async def create_test_data(test_type: str, test_model_id: str, sample_count: int = 100) -> tuple[str, CreateTestDataResponse]:
    """Create test data."""
    # Your existing logic to build the response...
    response = CreateTestDataResponse(
        data_url=data_url,
        status="success",
        url=url,
        test_type=test_type,
        model=test_model_id,
        total_count=sample_count,
        examples=example_objects,
        message="Test data created successfully",
    )
    # Return a tuple: (content_for_model, structured_data)
    summary = f"Successfully created {sample_count} test examples for {test_type} testing"
    return summary, response
```
The key changes:
1. Add `response_format="content_and_artifact"` to your `@tool` decorator.
2. Change your return type to `tuple[str, CreateTestDataResponse]`.
3. Return a tuple: `(summary_message, pydantic_object)`.
In your `on_tool_end` callback, you can then access the structured data via:
```python
def on_tool_end(self, output: Any, **kwargs) -> None:
    if hasattr(output, "artifact") and output.artifact:
        structured_data = output.artifact  # This will be your CreateTestDataResponse object
        # Now you can access: structured_data.examples[0].prompt
```
This approach preserves the original Pydantic object in the artifact field while providing a clean summary to the model in the content field.
Thank you so much for your suggestion! I've tried using response_format="content_and_artifact" as you recommended, and it works perfectly. By returning a tuple of (summary, pydantic_object) and accessing the structured data via output.artifact in the on_tool_end callback, I'm now able to preserve the full Pydantic object without it being stringified.
For anyone else facing a similar issue, here’s how I handled the output:
```python
import json
from typing import Any

from pydantic import BaseModel


def to_serializable(obj: Any) -> Any:
    """Recursively convert Pydantic models (and containers of them) to plain, JSON-serializable data."""
    if isinstance(obj, BaseModel):
        return obj.model_dump()
    elif isinstance(obj, list):
        return [to_serializable(i) for i in obj]
    elif isinstance(obj, dict):
        return {k: to_serializable(v) for k, v in obj.items()}
    elif isinstance(obj, str):
        # Some outputs arrive as JSON strings; fall back to the raw string otherwise.
        try:
            return json.loads(obj)
        except json.JSONDecodeError:
            return obj
    else:
        return obj


async def on_tool_end(self, output: Any, **kwargs: Any) -> None:
    tool_output = to_serializable(output.artifact or output.content)
```
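A quick self-contained check of that helper (redefined here so the snippet runs on its own; the `Example`/`Response` model names are illustrative stand-ins for `TestDataExample`/`CreateTestDataResponse`):

```python
import json
from typing import Any, List, Optional

from pydantic import BaseModel


class Example(BaseModel):
    prompt: str
    expected: Optional[str] = None


class Response(BaseModel):
    status: str
    examples: List[Example]


def to_serializable(obj: Any) -> Any:
    """Recursively convert Pydantic models (and containers of them) to plain data."""
    if isinstance(obj, BaseModel):
        # model_dump() already recurses into nested models.
        return obj.model_dump()
    if isinstance(obj, list):
        return [to_serializable(i) for i in obj]
    if isinstance(obj, dict):
        return {k: to_serializable(v) for k, v in obj.items()}
    return obj


resp = Response(status="success", examples=[Example(prompt="1+1=?", expected="2")])
data = to_serializable(resp)
print(json.dumps(data))
# → {"status": "success", "examples": [{"prompt": "1+1=?", "expected": "2"}]}
```

The result is a plain dict, so nested fields like `data["examples"][0]["prompt"]` are directly accessible and the whole thing round-trips through `json.dumps`.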