Tool decorated with `@tool` returns string instead of structured Pydantic object in `on_tool_end` callback
Closed this issue · 2 comments
Checked other resources
- This is a bug, not a usage question.
- I added a clear and descriptive title that summarizes this issue.
- I used the GitHub search to find a similar question and didn't find it.
- I am sure that this is a bug in LangChain rather than my code.
- The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
- This is not related to the langchain-community package.
- I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
- I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
Example Code
I'm using the `@tool` decorator from `langchain_core.tools` to define an async tool that returns a Pydantic model (`CreateTestDataResponse`). The return type is correctly annotated and the function returns a valid Pydantic instance, but when I observe the output in the `on_tool_end` callback (via tracing), the result appears as a stringified representation rather than the original structured JSON or Pydantic object.
```python
from typing import List, Optional

from langchain_core.tools import tool
from pydantic import BaseModel, Field


class TestDataExample(BaseModel):
    prompt: str = Field(..., description="Prompt text")
    max_tokens: Optional[int] = Field(None, description="Max tokens")
    expected: Optional[str] = Field(None, description="Expected output")


class CreateTestDataResponse(BaseModel):
    data_url: str
    status: str
    url: str
    test_type: str
    model: str
    total_count: int
    examples: List[TestDataExample]
    message: str


@tool(parse_docstring=True, return_direct=False)
async def create_test_data(test_type: str, test_model_id: str, sample_count: int = 100) -> CreateTestDataResponse:
    """Create test data.

    Args:
        test_type: Type of test, e.g., performance/functional.
        test_model_id: Model ID, e.g., qw35/qw45.
        sample_count: Number of samples, default 100.

    Returns:
        CreateTestDataResponse: Structured response with test data info.
    """
    # ... (mock logic to build data_url, url, example_objects)
    return CreateTestDataResponse(
        data_url=data_url,
        status="success",
        url=url,
        test_type=test_type,
        model=test_model_id,
        total_count=sample_count,
        examples=example_objects,
        message="Test data created successfully",
    )
```
Expected Behavior
In the `on_tool_end` callback or in LangSmith traces, I expect the tool's output to be accessible as a structured dictionary or JSON-compatible object (ideally preserving Pydantic types), so that downstream processing can access nested fields like `examples[0].prompt` directly.
Actual Behavior
The output in `on_tool_end` is received as a formatted string, for example:
```
data_url='data-68280' status='success' url='https://data.example.com/data-68280' test_type='functional' model='qw35' total_count=20 examples=[TestDataExample(prompt='1+1=?', max_tokens=None, expected='2'), ...] message='Test data created successfully'
```
This makes it difficult to parse or extract structured information without fragile string parsing.
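For context, the format shown above matches what Pydantic v2's default `__str__` produces, which suggests the tool runtime is coercing the return value with `str()` somewhere before it reaches the callback. A minimal sketch (the `Demo` model is illustrative, not from the original code):

```python
from pydantic import BaseModel


class Demo(BaseModel):
    status: str
    total_count: int


# str() on a Pydantic v2 model yields the space-separated field=value form
# seen in the callback, without the class name that repr() would include.
stringified = str(Demo(status="success", total_count=20))
print(stringified)
# → status='success' total_count=20
```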
Question
How can I ensure that the tool preserves the return value as a JSON-serializable dict or raw Pydantic object in callbacks/tracing, instead of being converted to a string representation?
Is this related to serialization settings in @tool, or is there a way to configure the run tracer to keep structured outputs?
Any guidance or workaround would be appreciated!
System Info
System Information
OS: Darwin
OS Version: Darwin Kernel Version 23.5.0: Wed May 1 20:14:59 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T8122
Python Version: 3.11.9 (v3.11.9:de54cf5be3, Apr 2 2024, 07:12:50) [Clang 13.0.0 (clang-1300.0.29.30)]
Package Information
langchain_core: 0.3.74
langchain: 0.3.27
langchain_community: 0.3.27
langsmith: 0.4.4
langchain_anthropic: 0.3.3
langchain_aws: 0.2.18
langchain_mcp_adapters: 0.1.9
langchain_openai: 0.3.1
langchain_text_splitters: 0.3.9
langchainhub: 0.1.15
langgraph_sdk: 0.1.74
Optional packages not installed
langserve
Other Dependencies
aiohttp<4.0.0,>=3.8.3: Installed. No version info available.
anthropic: 0.64.0
async-timeout<5.0.0,>=4.0.0;: Installed. No version info available.
boto3: 1.39.9
dataclasses-json<0.7,>=0.5.7: Installed. No version info available.
defusedxml: 0.7.1
httpx: 0.27.0
httpx-sse<1.0.0,>=0.4.0: Installed. No version info available.
httpx>=0.25.2: Installed. No version info available.
jsonpatch<2.0,>=1.33: Installed. No version info available.
langchain-anthropic;: Installed. No version info available.
langchain-aws;: Installed. No version info available.
langchain-azure-ai;: Installed. No version info available.
langchain-cohere;: Installed. No version info available.
langchain-community;: Installed. No version info available.
langchain-core<0.4,>=0.3.36: Installed. No version info available.
langchain-core<1.0.0,>=0.3.66: Installed. No version info available.
langchain-core<1.0.0,>=0.3.72: Installed. No version info available.
langchain-deepseek;: Installed. No version info available.
langchain-fireworks;: Installed. No version info available.
langchain-google-genai;: Installed. No version info available.
langchain-google-vertexai;: Installed. No version info available.
langchain-groq;: Installed. No version info available.
langchain-huggingface;: Installed. No version info available.
langchain-mistralai;: Installed. No version info available.
langchain-ollama;: Installed. No version info available.
langchain-openai;: Installed. No version info available.
langchain-perplexity;: Installed. No version info available.
langchain-text-splitters<1.0.0,>=0.3.9: Installed. No version info available.
langchain-together;: Installed. No version info available.
langchain-xai;: Installed. No version info available.
langchain<1.0.0,>=0.3.26: Installed. No version info available.
langsmith-pyo3: Installed. No version info available.
langsmith>=0.1.125: Installed. No version info available.
langsmith>=0.1.17: Installed. No version info available.
langsmith>=0.3.45: Installed. No version info available.
mcp>=1.9.2: Installed. No version info available.
numpy: 1.26.4
numpy>=1.26.2;: Installed. No version info available.
numpy>=2.1.0;: Installed. No version info available.
openai: 1.58.1
openai-agents: Installed. No version info available.
opentelemetry-api: 1.36.0
opentelemetry-exporter-otlp-proto-http: Installed. No version info available.
opentelemetry-sdk: 1.36.0
orjson: 3.11.0
orjson>=3.10.1: Installed. No version info available.
packaging: 24.2
packaging>=23.2: Installed. No version info available.
pydantic: 2.11.7
pydantic-settings<3.0.0,>=2.4.0: Installed. No version info available.
pydantic<3.0.0,>=2.7.4: Installed. No version info available.
pydantic>=2.7.4: Installed. No version info available.
pytest: 7.4.0
PyYAML>=5.3: Installed. No version info available.
requests: 2.32.4
requests-toolbelt: 1.0.0
requests<3,>=2: Installed. No version info available.
rich: 14.0.0
SQLAlchemy<3,>=1.4: Installed. No version info available.
tenacity!=8.4.0,<10,>=8.1.0: Installed. No version info available.
tenacity!=8.4.0,<10.0.0,>=8.1.0: Installed. No version info available.
tiktoken: 0.9.0
types-requests: 2.32.4.20250611
typing-extensions>=4.14.0: Installed. No version info available.
typing-extensions>=4.7: Installed. No version info available.
zstandard: 0.23.0
Hello, I recommend trying the `response_format="content_and_artifact"` parameter in your `@tool` decorator. This should solve the problem of Pydantic objects being stringified in the `on_tool_end` callback.
Here's how to modify your tool:
```python
@tool(parse_docstring=True, return_direct=False, response_format="content_and_artifact")
async def create_test_data(test_type: str, test_model_id: str, sample_count: int = 100) -> tuple[str, CreateTestDataResponse]:
    """Create test data."""
    # Your existing logic to build the response...
    response = CreateTestDataResponse(
        data_url=data_url,
        status="success",
        url=url,
        test_type=test_type,
        model=test_model_id,
        total_count=sample_count,
        examples=example_objects,
        message="Test data created successfully",
    )
    # Return a tuple: (content_for_model, structured_data)
    summary = f"Successfully created {sample_count} test examples for {test_type} testing"
    return summary, response
```
The key changes:
1. Add `response_format="content_and_artifact"` to your `@tool` decorator.
2. Change your return type to `tuple[str, CreateTestDataResponse]`.
3. Return a tuple: `(summary_message, pydantic_object)`.
In your `on_tool_end` callback, you can then access the structured data via:
```python
def on_tool_end(self, output: Any, **kwargs) -> None:
    if hasattr(output, "artifact") and output.artifact:
        structured_data = output.artifact  # This will be your CreateTestDataResponse object
        # Now you can access: structured_data.examples[0].prompt
```
This approach preserves the original Pydantic object in the artifact field while providing a clean summary to the model in the content field.
Thank you so much for your suggestion! I've tried using response_format="content_and_artifact" as you recommended, and it works perfectly. By returning a tuple of (summary, pydantic_object) and accessing the structured data via output.artifact in the on_tool_end callback, I'm now able to preserve the full Pydantic object without it being stringified.
For anyone else facing a similar issue, here’s how I handled the output:
```python
import json
from typing import Any

from pydantic import BaseModel


def to_serializable(obj: Any) -> Any:
    """Recursively convert Pydantic models (and containers of them) to plain, JSON-serializable data."""
    if isinstance(obj, BaseModel):
        return obj.model_dump()
    elif isinstance(obj, list):
        return [to_serializable(i) for i in obj]
    elif isinstance(obj, dict):
        return {k: to_serializable(v) for k, v in obj.items()}
    elif isinstance(obj, str):
        # Some outputs arrive as JSON strings; fall back to the raw string otherwise.
        try:
            return json.loads(obj)
        except json.JSONDecodeError:
            return obj
    else:
        return obj


async def on_tool_end(self, output: Any, **kwargs: Any) -> None:
    tool_output = to_serializable(output.artifact or output.content)
```
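A quick self-contained check of that helper (redefined here so the snippet runs on its own; the `Example`/`Response` model names are illustrative stand-ins for `TestDataExample`/`CreateTestDataResponse`):

```python
import json
from typing import Any, List, Optional

from pydantic import BaseModel


class Example(BaseModel):
    prompt: str
    expected: Optional[str] = None


class Response(BaseModel):
    status: str
    examples: List[Example]


def to_serializable(obj: Any) -> Any:
    """Recursively convert Pydantic models (and containers of them) to plain data."""
    if isinstance(obj, BaseModel):
        # model_dump() already recurses into nested models.
        return obj.model_dump()
    if isinstance(obj, list):
        return [to_serializable(i) for i in obj]
    if isinstance(obj, dict):
        return {k: to_serializable(v) for k, v in obj.items()}
    return obj


resp = Response(status="success", examples=[Example(prompt="1+1=?", expected="2")])
data = to_serializable(resp)
print(json.dumps(data))
# → {"status": "success", "examples": [{"prompt": "1+1=?", "expected": "2"}]}
```

The result is a plain dict, so nested fields like `data["examples"][0]["prompt"]` are directly accessible and the whole thing round-trips through `json.dumps`.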