langchain-ai/langchain

Receive BadRequest 400 when invoking GPT-OSS-20B


Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Example Code

We have a local GPT-OSS-20B model running on vLLM.

We first load it as a ChatOpenAI model.

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="openai/gpt-oss-20b",
    base_url="http://localhost/v1",
    api_key="EMPTY",
    temperature=0.0,
    max_tokens=10000,
    timeout=180,
    max_retries=2,
    use_responses_api=True
)

Note: We are able to invoke this llm with a simple HumanMessage.
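For reference, a plain invocation like the following succeeds (the exact prompt is just an illustration):

from langchain_core.messages import HumanMessage

# A simple chat turn with no tools bound returns a normal AIMessage.
reply = llm.invoke([HumanMessage(content="Hello!")])
print(reply.content)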

We then define a very simple tool as follows.

from langchain_core.tools import tool
 
@tool
def get_weather(location: str) -> str:
    """
    Get the current weather for a location.
    Args:
        location: name of location that you want to check.
    return:
        return general weather and temperature
    """
    if location.lower() == "paris":
        return "Sunny, 25°C"
    elif location.lower() == "london":
        return "Cloudy, 18°C"
    else:
        return f"No weather data for {location}"

We are able to obtain a tool_call response with

from langchain_core.messages import HumanMessage
llm_with_tools = llm.bind_tools([get_weather])
messages = [HumanMessage(content="How is the weather in Paris?")]
message = llm_with_tools.invoke(messages)
message

where message looks like this:

AIMessage(content=[], additional_kwargs={'reasoning': {'id': 'rs_5390005f10d84a2ead227a8931a6ec15', 'summary': [], 'type': 'reasoning', 'content': [{'text': 'We need to use the get_weather function.', 'type': 'reasoning_text'}]}, '__openai_function_call_ids__': {'call_9b56ba57d2f54bfe9e974ddca1241199': 'ft_9b56ba57d2f54bfe9e974ddca1241199'}}, response_metadata={'id': 'resp_0eb4781a55174b179f6a3431fcbfaccb', 'created_at': 1758171845.0, 'model': 'openai/gpt-oss-20b', 'object': 'response', 'service_tier': 'auto', 'status': 'completed', 'model_name': 'openai/gpt-oss-20b'}, id='run--2593891c-28d2-4c90-84b2-0123e2e56f34-0', tool_calls=[{'name': 'get_weather', 'args': {'location': 'Paris'}, 'id': 'call_9b56ba57d2f54bfe9e974ddca1241199', 'type': 'tool_call'}], usage_metadata={'input_tokens': 0, 'output_tokens': 0, 'total_tokens': 0, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 0}})

By executing the tool, we obtain the following ToolMessage:

ToolMessage(content='Sunny, 25°C', name='get_weather', id='3e6108e8-96f4-4c28-a17a-f85aefe51cbb', tool_call_id='call_9b56ba57d2f54bfe9e974ddca1241199')

If we append the AIMessage and ToolMessage to the original messages and invoke llm_with_tools a second time, we unfortunately receive the exception below.
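For completeness, here is a minimal sketch of the failing step (the tool-execution loop is an assumption; the shapes of the appended messages match the objects shown above):

from langchain_core.messages import ToolMessage

# Append the assistant turn that requested the tool call...
messages.append(message)

# ...then execute each requested tool and append its result as a ToolMessage.
for tool_call in message.tool_calls:
    tool_output = get_weather.invoke(tool_call["args"])
    messages.append(ToolMessage(content=tool_output, tool_call_id=tool_call["id"]))

# This second invocation is the one that raises the BadRequestError (400).
message = llm_with_tools.invoke(messages)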

Error Message and Stack Trace (if applicable)

---------------------------------------------------------------------------
BadRequestError                           Traceback (most recent call last)
Cell In[12], line 2      
          1 llm_with_tools = llm.bind_tools(tools=[get_weather], strict=True)
----> 2 message = llm_with_tools.invoke(messages)
          3 message
File /usr/local/lib/python3.10/dist-packages/langchain_core/runnables/base.py:5710, in RunnableBindingBase.invoke(self, input, config, **kwargs)
   5703 @override
   5704 def invoke(
   5705     self,
   (...)
   5708     **kwargs: Optional[Any],
   5709 ) -> Output:
-> 5710     return self.bound.invoke(
   5711         input,
   5712         self._merge_configs(config),
   5713         **{**self.kwargs, **kwargs},
   5714     )
File /usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py:395, in BaseChatModel.invoke(self, input, config, stop, **kwargs)
    383 @override
    384 def invoke(
    385     self,
   (...)
    390     **kwargs: Any,
    391 ) -> BaseMessage:
    392     config = ensure_config(config)
    393     return cast(
    394         "ChatGeneration",
--> 395         self.generate_prompt(
    396             [self._convert_input(input)],
    397             stop=stop,
    398             callbacks=config.get("callbacks"),
    399             tags=config.get("tags"),
    400             metadata=config.get("metadata"),
    401             run_name=config.get("run_name"),
    402             run_id=config.pop("run_id", None),
    403             **kwargs,
    404         ).generations[0][0],
    405     ).message
File /usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py:1023, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
   1014 @override
   1015 def generate_prompt(
   1016     self,
   (...)
   1020     **kwargs: Any,
   1021 ) -> LLMResult:
   1022     prompt_messages = [p.to_messages() for p in prompts]
-> 1023     return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
File /usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py:840, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
    837 for i, m in enumerate(input_messages):
    838     try:
    839         results.append(
--> 840             self._generate_with_cache(
    841                 m,
    842                 stop=stop,
    843                 run_manager=run_managers[i] if run_managers else None,
    844                 **kwargs,
    845             )
    846         )
    847     except BaseException as e:
    848         if run_managers:
File /usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py:1089, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, **kwargs)
   1087     result = generate_from_stream(iter(chunks))
   1088 elif inspect.signature(self._generate).parameters.get("run_manager"):
-> 1089     result = self._generate(
   1090         messages, stop=stop, run_manager=run_manager, **kwargs
   1091     )
   1092 else:
   1093     result = self._generate(messages, stop=stop, **kwargs)
File /usr/local/lib/python3.10/dist-packages/langchain_openai/chat_models/base.py:1184, in BaseChatOpenAI._generate(self, messages, stop, run_manager, **kwargs)
   1182     if raw_response is not None and hasattr(raw_response, "http_response"):
   1183         e.response = raw_response.http_response  # type: ignore[attr-defined]
-> 1184     raise e
   1185 if (
   1186     self.include_response_headers
   1187     and raw_response is not None
   1188     and hasattr(raw_response, "headers")
   1189 ):
   1190     generation_info = {"headers": dict(raw_response.headers)}
File /usr/local/lib/python3.10/dist-packages/langchain_openai/chat_models/base.py:1166, in BaseChatOpenAI._generate(self, messages, stop, run_manager, **kwargs)
   1162     raw_response = self.root_client.responses.with_raw_response.parse(
   1163         **payload
   1164     )
   1165 else:
-> 1166     raw_response = self.root_client.responses.with_raw_response.create(
   1167         **payload
   1168     )
   1169 response = raw_response.parse()
   1170 if self.include_response_headers:
File /usr/local/lib/python3.10/dist-packages/openai/_legacy_response.py:364, in to_raw_response_wrapper.<locals>.wrapped(*args, **kwargs)
    360 extra_headers[RAW_RESPONSE_HEADER] = "true"
    362 kwargs["extra_headers"] = extra_headers
--> 364 return cast(LegacyAPIResponse[R], func(*args, **kwargs))
File /usr/local/lib/python3.10/dist-packages/openai/resources/responses/responses.py:828, in Responses.create(self, background, conversation, include, input, instructions, max_output_tokens, max_tool_calls, metadata, model, parallel_tool_calls, previous_response_id, prompt, prompt_cache_key, reasoning, safety_identifier, service_tier, store, stream, stream_options, temperature, text, tool_choice, tools, top_logprobs, top_p, truncation, user, extra_headers, extra_query, extra_body, timeout)
    791 def create(
    792     self,
    793     *,
   (...)
    826     timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
    827 ) -> Response | Stream[ResponseStreamEvent]:
--> 828     return self._post(
    829         "/responses",
    830         body=maybe_transform(
    831             {
    832                 "background": background,
    833                 "conversation": conversation,
    834                 "include": include,
    835                 "input": input,
    836                 "instructions": instructions,
    837                 "max_output_tokens": max_output_tokens,
    838                 "max_tool_calls": max_tool_calls,
    839                 "metadata": metadata,
    840                 "model": model,
    841                 "parallel_tool_calls": parallel_tool_calls,
    842                 "previous_response_id": previous_response_id,
    843                 "prompt": prompt,
    844                 "prompt_cache_key": prompt_cache_key,
    845                 "reasoning": reasoning,
    846                 "safety_identifier": safety_identifier,
    847                 "service_tier": service_tier,
    848                 "store": store,
    849                 "stream": stream,
    850                 "stream_options": stream_options,
    851                 "temperature": temperature,
    852                 "text": text,
    853                 "tool_choice": tool_choice,
    854                 "tools": tools,
    855                 "top_logprobs": top_logprobs,
    856                 "top_p": top_p,
    857                 "truncation": truncation,
    858                 "user": user,
    859             },
    860             response_create_params.ResponseCreateParamsStreaming
    861             if stream
    862             else response_create_params.ResponseCreateParamsNonStreaming,
    863         ),
    864         options=make_request_options(
    865             extra_headers=extra_headers, extra_query=extra_query, extra_body=extra_body, timeout=timeout
    866         ),
    867         cast_to=Response,
    868         stream=stream or False,
    869         stream_cls=Stream[ResponseStreamEvent],
    870     )
File /usr/local/lib/python3.10/dist-packages/openai/_base_client.py:1259, in SyncAPIClient.post(self, path, cast_to, body, options, files, stream, stream_cls)
   1245 def post(
   1246     self,
   1247     path: str,
   (...)
   1254     stream_cls: type[_StreamT] | None = None,
   1255 ) -> ResponseT | _StreamT:
   1256     opts = FinalRequestOptions.construct(
   1257         method="post", url=path, json_data=body, files=to_httpx_files(files), **options
   1258     )
-> 1259     return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
File /usr/local/lib/python3.10/dist-packages/openai/_base_client.py:1047, in SyncAPIClient.request(self, cast_to, options, stream, stream_cls)
   1044             err.response.read()
   1046         log.debug("Re-raising status error")
-> 1047         raise self._make_status_error_from_response(err.response) from None
   1049     break
   1051 assert response is not None, "could not resolve response (should never happen)"
BadRequestError: Error code: 400 - {'object': 'error', 'message': "object of type 'pydantic_core._pydantic_core.ValidatorIterator' has no len() None", 'type': 'BadRequestError', 'param': None, 'code': 400}

Description

I would like to use a local GPT-OSS-20B model as my LLM for tool calling. It does not generate a correct AIMessage with tool_calls if I do not specify use_responses_api=True. However, if I provide this argument, I encounter the above exception.

We tried to discuss this in #32885. However, I use a different approach here, and I believe this is not related to langgraph, as I do not use the create_react_agent function.

System Info

System Information

OS: Linux
OS Version: #1 SMP PREEMPT_DYNAMIC Wed May 1 15:46:25 EDT 2024
Python Version: 3.10.12 (main, May 27 2025, 17:12:29) [GCC 11.4.0]

Package Information

langchain_core: 0.3.76
langchain: 0.3.27
langchain_community: 0.3.25
langsmith: 0.3.45
langchain_chroma: 0.2.4
langchain_google_genai: 2.1.6
langchain_openai: 0.3.33
langchain_text_splitters: 0.3.11
langgraph_sdk: 0.1.74

Optional packages not installed

langserve

Other Dependencies

aiohttp<4.0.0,>=3.8.3: Installed. No version info available.
async-timeout<5.0.0,>=4.0.0;: Installed. No version info available.
chromadb>=1.0.9: Installed. No version info available.
dataclasses-json<0.7,>=0.5.7: Installed. No version info available.
filetype: 1.2.0
google-ai-generativelanguage: 0.6.18
httpx: 0.28.1
httpx-sse<1.0.0,>=0.4.0: Installed. No version info available.
httpx>=0.25.2: Installed. No version info available.
jsonpatch<2.0,>=1.33: Installed. No version info available.
langchain-anthropic;: Installed. No version info available.
langchain-aws;: Installed. No version info available.
langchain-azure-ai;: Installed. No version info available.
langchain-cohere;: Installed. No version info available.
langchain-community;: Installed. No version info available.
langchain-core<1.0.0,>=0.3.65: Installed. No version info available.
langchain-core<1.0.0,>=0.3.72: Installed. No version info available.
langchain-core<1.0.0,>=0.3.76: Installed. No version info available.
langchain-core<2.0.0,>=0.3.75: Installed. No version info available.
langchain-core>=0.3.60: Installed. No version info available.
langchain-deepseek;: Installed. No version info available.
langchain-fireworks;: Installed. No version info available.
langchain-google-genai;: Installed. No version info available.
langchain-google-vertexai;: Installed. No version info available.
langchain-groq;: Installed. No version info available.
langchain-huggingface;: Installed. No version info available.
langchain-mistralai;: Installed. No version info available.
langchain-ollama;: Installed. No version info available.
langchain-openai;: Installed. No version info available.
langchain-perplexity;: Installed. No version info available.
langchain-text-splitters<1.0.0,>=0.3.9: Installed. No version info available.
langchain-together;: Installed. No version info available.
langchain-xai;: Installed. No version info available.
langchain<1.0.0,>=0.3.25: Installed. No version info available.
langsmith-pyo3: Installed. No version info available.
langsmith<0.4,>=0.1.125: Installed. No version info available.
langsmith>=0.1.17: Installed. No version info available.
langsmith>=0.3.45: Installed. No version info available.
numpy>=1.26.0;: Installed. No version info available.
numpy>=1.26.2;: Installed. No version info available.
numpy>=2.1.0;: Installed. No version info available.
openai-agents: Installed. No version info available.
openai<2.0.0,>=1.104.2: Installed. No version info available.
opentelemetry-api: 1.37.0
opentelemetry-exporter-otlp-proto-http: Installed. No version info available.
opentelemetry-sdk: 1.37.0
orjson: 3.10.18
orjson>=3.10.1: Installed. No version info available.
packaging: 25.0
packaging>=23.2: Installed. No version info available.
pydantic: 2.11.9
pydantic-settings<3.0.0,>=2.4.0: Installed. No version info available.
pydantic<3.0.0,>=2.7.4: Installed. No version info available.
pydantic>=2.7.4: Installed. No version info available.
pytest: Installed. No version info available.
PyYAML>=5.3: Installed. No version info available.
requests: 2.31.0
requests-toolbelt: 1.0.0
requests<3,>=2: Installed. No version info available.
rich: 14.1.0
SQLAlchemy<3,>=1.4: Installed. No version info available.
tenacity!=8.4.0,<10,>=8.1.0: Installed. No version info available.
tenacity!=8.4.0,<10.0.0,>=8.1.0: Installed. No version info available.
tiktoken<1,>=0.7: Installed. No version info available.
typing-extensions>=4.7: Installed. No version info available.
zstandard: 0.23.0

use_responses_api=True

vLLM doesn't yet support the OpenAI Responses API; see vllm-project/vllm#14721.

Unset the param and it should hopefully work (I'm having issues with vLLM on my MacBook, so I can't test).

@mdrxy I have tried without use_responses_api=True.

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="openai/gpt-oss-20b",
    base_url="http://localhost/v1",
    api_key="EMPTY",
    temperature=0.0,
    max_tokens=10000,
    timeout=180,
    max_retries=2
)

After bind_tools, my llm_with_tools cannot return an AIMessage with tool_calls information. If I create my agent with the langgraph create_react_agent function, then I receive something like the following.

[screenshot of the output]

Hi, I faced the same issue and just wondered if you have solved this problem or not.

Hi @wldlsyy. I temporarily used the openai-agents SDK to interact with the GPT-OSS-20B model. You can install it with pip install openai-agents while waiting for a fix.
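For anyone who wants to try the same workaround, here is a minimal sketch of pointing the openai-agents SDK at a local vLLM endpoint through its Chat Completions API (the agent name, instructions, and endpoint URL are assumptions; adapt them to your setup):

from openai import AsyncOpenAI
from agents import Agent, Runner, OpenAIChatCompletionsModel, function_tool, set_tracing_disabled

# Point the SDK at the local vLLM server instead of api.openai.com.
client = AsyncOpenAI(base_url="http://localhost/v1", api_key="EMPTY")
set_tracing_disabled(True)  # no OpenAI platform key, so disable tracing uploads

@function_tool
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    if location.lower() == "paris":
        return "Sunny, 25°C"
    return f"No weather data for {location}"

agent = Agent(
    name="weather-agent",  # hypothetical name
    instructions="Answer weather questions using the available tools.",
    model=OpenAIChatCompletionsModel(model="openai/gpt-oss-20b", openai_client=client),
    tools=[get_weather],
)

result = Runner.run_sync(agent, "How is the weather in Paris?")
print(result.final_output)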