langchain-ai/langchain

Unable to retrieve raw LLM response on JSON parsing error during structured output with retries; subsequent retries are extremely slow


Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Example Code

When using ChatOpenAI with .with_structured_output(..., include_raw=True), if the underlying API returns an invalid or malformed JSON response (e.g., due to network issues, a partial response, or a model output error), a JSONDecodeError is raised deep in the call stack.

The main problems are:

  • No access to the raw response: even though include_raw=True is set, when a JSON parsing error occurs LangChain raises the exception immediately instead of returning the raw response via the raw field of the result.
  • Subsequent retries are extremely slow: after a failure and retry (via max_retries=3), the next invocation takes significantly longer than usual, sometimes tens of seconds or more, even though the input and context are unchanged.

This makes it difficult to:

  • Debug what malformed response the LLM actually returned.
  • Determine whether the issue originates on OpenAI's side (e.g., an incomplete streamed response) or in post-processing.
  • Diagnose why retries become so slow.

import asyncio
from typing import List

from pydantic import BaseModel, Field  # pydantic v2; langchain_core.pydantic_v1 is deprecated
from langchain_core.messages import AIMessage, HumanMessage
from langchain_openai import ChatOpenAI


class Response(BaseModel):
    """Structured reasoning output to determine the next agent and its input."""

    todolist: List[str] = Field(default_factory=list, description="[Optional] Execution plan steps")
    current_todo_status: str = Field(default="", description="[Optional] Status description of current plan")
    next_agent: str = Field(description="[Required] Next agent to invoke")
    next_agent_input: str = Field(description="[Required] Natural language instruction for next agent or final summary")
    reasoning: str = Field(description="[Required] Reasoning path: from chat history → agent selection → input generation")


# Initialize LLM (routed through OpenRouter; base_url and API key come from the environment)
llm = ChatOpenAI(
    model="openai/gpt-4.1-mini",
    temperature=0.1,
    timeout=180,
    max_tokens=30000,
    max_retries=3,
)

# Use structured output with the raw response included
llm_with_structured_output = llm.with_structured_output(Response, include_raw=True)

# Example chat history
chat_history = [
    HumanMessage(content="Plan a trip to Tokyo next month."),
    AIMessage(content="Okay, I'll help you plan your trip."),
]


async def main() -> None:
    # Async invoke that may fail
    try:
        llm_output = await llm_with_structured_output.ainvoke(chat_history)
        print(llm_output)
    except Exception as e:
        print(e)  # JSONDecodeError is raised, with no way to get the raw response


asyncio.run(main())
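As a stopgap for the first problem, the offending text is often recoverable from the exception itself: the standard library's json.JSONDecodeError carries the full string it was parsing in its .doc attribute and the failure offset in .pos. A minimal illustration with a simulated truncated body:

```python
import json

truncated_body = '{"next_agent": "planner", "reasoning": "th'  # simulated partial response

try:
    json.loads(truncated_body)
except json.JSONDecodeError as e:
    # e.doc holds the complete text that failed to parse; e.pos is the offset
    print(f"parse failed at char {e.pos}: {e.doc[:40]!r}")
```

This only helps if the caller can catch the JSONDecodeError before LangChain's retry machinery swallows it, which is part of what this issue is asking to improve.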

Expected Behavior

  • When include_raw=True, the raw HTTP response body should be accessible in the returned object, or attached to the exception, even if JSON parsing fails.
  • Retries should maintain consistent performance unless rate-limited; the observed slowness suggests backoff or a stuck state.

Actual Behavior

json.decoder.JSONDecodeError: Expecting value: line 859 column 1 (char 4719)
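One possible explanation for the slow retries (an assumption about SDK behavior, not something this trace confirms): the OpenAI SDK backs off exponentially between attempts, and with timeout=180 each retried attempt can itself run up to the full timeout. A back-of-the-envelope worst case, using assumed backoff defaults of 0.5 s initial, doubling, capped at 8 s:

```python
# Rough worst-case wall-clock model for max_retries=3, timeout=180s.
# The backoff figures (0.5s initial, doubling, 8s cap) are assumptions
# about the OpenAI SDK's defaults, not values confirmed from this trace.
timeout = 180.0
retries = 3
backoffs = [min(0.5 * 2**n, 8.0) for n in range(retries)]  # [0.5, 1.0, 2.0]
worst_case = timeout * (retries + 1) + sum(backoffs)
print(worst_case)  # 723.5 seconds under these assumptions
```

So a single hung upstream request is enough to make the whole structured-output call stall for minutes, which would match the "tens of seconds or more" symptom.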

Error Message and Stack Trace (if applicable)

    llm_output = await llm_with_structured_output.ainvoke(chat_history, config=self.config)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 3088, in ainvoke
    input_ = await coro_with_context(part(), context, create_task=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 3816, in ainvoke
    results = await asyncio.gather(
              ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 3808, in _ainvoke_step
    return await coro_with_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 5447, in ainvoke
    return await self.bound.ainvoke(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 417, in ainvoke
    llm_result = await self.agenerate_prompt(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 991, in agenerate_prompt
    return await self.agenerate(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 949, in agenerate
    raise exceptions[0]
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 1117, in _agenerate_with_cache
    result = await self._agenerate(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/langchain_openai/chat_models/base.py", line 950, in _agenerate
    response = await self.root_async_client.beta.chat.completions.parse(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/openai/resources/beta/chat/completions.py", line 435, in parse
    return await self._post(
           ^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/openai/_base_client.py", line 1843, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/openai/_base_client.py", line 1537, in request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/openai/_base_client.py", line 1640, in _request
    return await self._process_response(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/openai/_base_client.py", line 1737, in _process_response
    return await api_response.parse()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/openai/_response.py", line 424, in parse
    parsed = self._parse(to=to)
             ^^^^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/openai/_response.py", line 259, in _parse
    data = response.json()
           ^^^^^^^^^^^^^^^
  File "/Users/rushant/Documents/baidu/nlp-qa/tools-factory/.venv/lib/python3.11/site-packages/httpx/_models.py", line 764, in json
    return jsonlib.loads(self.content, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 859 column 1 (char 4719)


System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.5.0: Wed May 1 20:14:59 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T8122
Python Version: 3.11.9 (v3.11.9:de54cf5be3, Apr 2 2024, 07:12:50) [Clang 13.0.0 (clang-1300.0.29.30)]

Package Information

langchain_core: 0.3.74
langchain: 0.3.27
langchain_community: 0.3.27
langsmith: 0.4.4
langchain_anthropic: 0.3.3
langchain_aws: 0.2.18
langchain_mcp_adapters: 0.1.9
langchain_openai: 0.3.1
langchain_text_splitters: 0.3.9
langchainhub: 0.1.15
langgraph_sdk: 0.1.74

Optional packages not installed

langserve

Other Dependencies

aiohttp<4.0.0,>=3.8.3: Installed. No version info available.
anthropic: 0.64.0
async-timeout<5.0.0,>=4.0.0;: Installed. No version info available.
boto3: 1.39.9
dataclasses-json<0.7,>=0.5.7: Installed. No version info available.
defusedxml: 0.7.1
httpx: 0.27.0
httpx-sse<1.0.0,>=0.4.0: Installed. No version info available.
httpx>=0.25.2: Installed. No version info available.
jsonpatch<2.0,>=1.33: Installed. No version info available.
langchain-anthropic;: Installed. No version info available.
langchain-aws;: Installed. No version info available.
langchain-azure-ai;: Installed. No version info available.
langchain-cohere;: Installed. No version info available.
langchain-community;: Installed. No version info available.
langchain-core<0.4,>=0.3.36: Installed. No version info available.
langchain-core<1.0.0,>=0.3.66: Installed. No version info available.
langchain-core<1.0.0,>=0.3.72: Installed. No version info available.
langchain-deepseek;: Installed. No version info available.
langchain-fireworks;: Installed. No version info available.
langchain-google-genai;: Installed. No version info available.
langchain-google-vertexai;: Installed. No version info available.
langchain-groq;: Installed. No version info available.
langchain-huggingface;: Installed. No version info available.
langchain-mistralai;: Installed. No version info available.
langchain-ollama;: Installed. No version info available.
langchain-openai;: Installed. No version info available.
langchain-perplexity;: Installed. No version info available.
langchain-text-splitters<1.0.0,>=0.3.9: Installed. No version info available.
langchain-together;: Installed. No version info available.
langchain-xai;: Installed. No version info available.
langchain<1.0.0,>=0.3.26: Installed. No version info available.
langsmith-pyo3: Installed. No version info available.
langsmith>=0.1.125: Installed. No version info available.
langsmith>=0.1.17: Installed. No version info available.
langsmith>=0.3.45: Installed. No version info available.
mcp>=1.9.2: Installed. No version info available.
numpy: 1.26.4
numpy>=1.26.2;: Installed. No version info available.
numpy>=2.1.0;: Installed. No version info available.
openai: 1.58.1
openai-agents: Installed. No version info available.
opentelemetry-api: 1.36.0
opentelemetry-exporter-otlp-proto-http: Installed. No version info available.
opentelemetry-sdk: 1.36.0
orjson: 3.11.0
orjson>=3.10.1: Installed. No version info available.
packaging: 24.2
packaging>=23.2: Installed. No version info available.
pydantic: 2.11.7
pydantic-settings<3.0.0,>=2.4.0: Installed. No version info available.
pydantic<3.0.0,>=2.7.4: Installed. No version info available.
pydantic>=2.7.4: Installed. No version info available.
pytest: 7.4.0
PyYAML>=5.3: Installed. No version info available.
requests: 2.32.4
requests-toolbelt: 1.0.0
requests<3,>=2: Installed. No version info available.
rich: 14.0.0
SQLAlchemy<3,>=1.4: Installed. No version info available.
tenacity!=8.4.0,<10,>=8.1.0: Installed. No version info available.
tenacity!=8.4.0,<10.0.0,>=8.1.0: Installed. No version info available.
tiktoken: 0.9.0
types-requests: 2.32.4.20250611
typing-extensions>=4.14.0: Installed. No version info available.
typing-extensions>=4.7: Installed. No version info available.
zstandard: 0.23.0