langchain-ai/langchain

Receive BadRequest 400 when invoking GPT-OSS-20B


Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Example Code

We have a local GPT-OSS-20B model running on vLLM.

We first load it as a ChatOpenAI model.

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="openai/gpt-oss-20b",
    base_url="http://localhost/v1",
    api_key="EMPTY",
    temperature=0.0,
    max_tokens=10000,
    timeout=180,
    max_retries=2,
    use_responses_api=True
)

Note: We are able to invoke this llm with a simple HumanMessage.
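For reference, a plain invocation like the following succeeds (the exact prompt is just an illustration):

from langchain_core.messages import HumanMessage

# A simple chat turn with no tools bound returns a normal AIMessage.
reply = llm.invoke([HumanMessage(content="Hello!")])
print(reply.content)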

We then define a very simple tool as follows.

from langchain_core.tools import tool
 
@tool
def get_weather(location: str) -> str:
    """
    Get the current weather for a location.
    Args:
        location: name of location that you want to check.
    return:
        return general weather and temperature
    """
    if location.lower() == "paris":
        return "Sunny, 25°C"
    elif location.lower() == "london":
        return "Cloudy, 18°C"
    else:
        return f"No weather data for {location}"

We are able to obtain a tool_call response with

from langchain_core.messages import HumanMessage
llm_with_tools = llm.bind_tools([get_weather])
messages = [HumanMessage(content="How is the weather in Paris?")]
message = llm_with_tools.invoke(messages)
message

where message looks like this:

AIMessage(content=[], additional_kwargs={'reasoning': {'id': 'rs_5390005f10d84a2ead227a8931a6ec15', 'summary': [], 'type': 'reasoning', 'content': [{'text': 'We need to use the get_weather function.', 'type': 'reasoning_text'}]}, '__openai_function_call_ids__': {'call_9b56ba57d2f54bfe9e974ddca1241199': 'ft_9b56ba57d2f54bfe9e974ddca1241199'}}, response_metadata={'id': 'resp_0eb4781a55174b179f6a3431fcbfaccb', 'created_at': 1758171845.0, 'model': 'openai/gpt-oss-20b', 'object': 'response', 'service_tier': 'auto', 'status': 'completed', 'model_name': 'openai/gpt-oss-20b'}, id='run--2593891c-28d2-4c90-84b2-0123e2e56f34-0', tool_calls=[{'name': 'get_weather', 'args': {'location': 'Paris'}, 'id': 'call_9b56ba57d2f54bfe9e974ddca1241199', 'type': 'tool_call'}], usage_metadata={'input_tokens': 0, 'output_tokens': 0, 'total_tokens': 0, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 0}})

By executing the tool, we obtain the following ToolMessage:

ToolMessage(content='Sunny, 25°C', name='get_weather', id='3e6108e8-96f4-4c28-a17a-f85aefe51cbb', tool_call_id='call_9b56ba57d2f54bfe9e974ddca1241199')

If we append the AIMessage and ToolMessage to the original messages and invoke llm_with_tools a second time, we unfortunately receive the exception below.
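For completeness, here is a minimal sketch of the failing step (the tool-execution loop is an assumption; the shapes of the appended messages match the objects shown above):

from langchain_core.messages import ToolMessage

# Append the assistant turn that requested the tool call...
messages.append(message)

# ...then execute each requested tool and append its result as a ToolMessage.
for tool_call in message.tool_calls:
    tool_output = get_weather.invoke(tool_call["args"])
    messages.append(ToolMessage(content=tool_output, tool_call_id=tool_call["id"]))

# This second invocation is the one that raises the BadRequestError (400).
message = llm_with_tools.invoke(messages)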

Error Message and Stack Trace (if applicable)

---------------------------------------------------------------------------
BadRequestError                           Traceback (most recent call last)
Cell In[12], line 2      
          1 llm_with_tools = llm.bind_tools(tools=[get_weather], strict=True)
----> 2 message = llm_with_tools.invoke(messages)
          3 message
File /usr/local/lib/python3.10/dist-packages/langchain_core/runnables/base.py:5710, in RunnableBindingBase.invoke(self, input, config, **kwargs)
   5703 @override
   5704 def invoke(
   5705     self,
   (...)
   5708     **kwargs: Optional[Any],
   5709 ) -> Output:
-> 5710     return self.bound.invoke(
   5711         input,
   5712         self._merge_configs(config),
   5713         **{**self.kwargs, **kwargs},
   5714     )
File /usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py:395, in BaseChatModel.invoke(self, input, config, stop, **kwargs)
    383 @override
    384 def invoke(
    385     self,
   (...)
    390     **kwargs: Any,
    391 ) -> BaseMessage:
    392     config = ensure_config(config)
    393     return cast(
    394         "ChatGeneration",
--> 395         self.generate_prompt(
    396             [self._convert_input(input)],
    397             stop=stop,
    398             callbacks=config.get("callbacks"),
    399             tags=config.get("tags"),
    400             metadata=config.get("metadata"),
    401             run_name=config.get("run_name"),
    402             run_id=config.pop("run_id", None),
    403             **kwargs,
    404         ).generations[0][0],
    405     ).message
File /usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py:1023, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
   1014 @override
   1015 def generate_prompt(
   1016     self,
   (...)
   1020     **kwargs: Any,
   1021 ) -> LLMResult:
   1022     prompt_messages = [p.to_messages() for p in prompts]
-> 1023     return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
File /usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py:840, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
    837 for i, m in enumerate(input_messages):
    838     try:
    839         results.append(
--> 840             self._generate_with_cache(
    841                 m,
    842                 stop=stop,
    843                 run_manager=run_managers[i] if run_managers else None,
    844                 **kwargs,
    845             )
    846         )
    847     except BaseException as e:
    848         if run_managers:
File /usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py:1089, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, **kwargs)
   1087     result = generate_from_stream(iter(chunks))
   1088 elif inspect.signature(self._generate).parameters.get("run_manager"):
-> 1089     result = self._generate(
   1090         messages, stop=stop, run_manager=run_manager, **kwargs
   1091     )
   1092 else:
   1093     result = self._generate(messages, stop=stop, **kwargs)
File /usr/local/lib/python3.10/dist-packages/langchain_openai/chat_models/base.py:1184, in BaseChatOpenAI._generate(self, messages, stop, run_manager, **kwargs)
   1182     if raw_response is not None and hasattr(raw_response, "http_response"):
   1183         e.response = raw_response.http_response  # type: ignore[attr-defined]
-> 1184     raise e
   1185 if (
   1186     self.include_response_headers
   1187     and raw_response is not None
   1188     and hasattr(raw_response, "headers")
   1189 ):
   1190     generation_info = {"headers": dict(raw_response.headers)}
File /usr/local/lib/python3.10/dist-packages/langchain_openai/chat_models/base.py:1166, in BaseChatOpenAI._generate(self, messages, stop, run_manager, **kwargs)
   1162     raw_response = self.root_client.responses.with_raw_response.parse(
   1163         **payload
   1164     )
   1165 else:
-> 1166     raw_response = self.root_client.responses.with_raw_response.create(
   1167         **payload
   1168     )
   1169 response = raw_response.parse()
   1170 if self.include_response_headers:
File /usr/local/lib/python3.10/dist-packages/openai/_legacy_response.py:364, in to_raw_response_wrapper.<locals>.wrapped(*args, **kwargs)
    360 extra_headers[RAW_RESPONSE_HEADER] = "true"
    362 kwargs["extra_headers"] = extra_headers
--> 364 return cast(LegacyAPIResponse[R], func(*args, **kwargs))
File /usr/local/lib/python3.10/dist-packages/openai/resources/responses/responses.py:828, in Responses.create(self, background, conversation, include, input, instructions, max_output_tokens, max_tool_calls, metadata, model, parallel_tool_calls, previous_response_id, prompt, prompt_cache_key, reasoning, safety_identifier, service_tier, store, stream, stream_options, temperature, text, tool_choice, tools, top_logprobs, top_p, truncation, user, extra_headers, extra_query, extra_body, timeout)
    791 def create(
    792     self,
    793     *,
   (...)
    826     timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
    827 ) -> Response | Stream[ResponseStreamEvent]:
--> 828     return self._post(
    829         "/responses",
    830         body=maybe_transform(
    831             {
    832                 "background": background,
    833                 "conversation": conversation,
    834                 "include": include,
    835                 "input": input,
    836                 "instructions": instructions,
    837                 "max_output_tokens": max_output_tokens,
    838                 "max_tool_calls": max_tool_calls,
    839                 "metadata": metadata,
    840                 "model": model,
    841                 "parallel_tool_calls": parallel_tool_calls,
    842                 "previous_response_id": previous_response_id,
    843                 "prompt": prompt,
    844                 "prompt_cache_key": prompt_cache_key,
    845                 "reasoning": reasoning,
    846                 "safety_identifier": safety_identifier,
    847                 "service_tier": service_tier,
    848                 "store": store,
    849                 "stream": stream,
    850                 "stream_options": stream_options,
    851                 "temperature": temperature,
    852                 "text": text,
    853                 "tool_choice": tool_choice,
    854                 "tools": tools,
    855                 "top_logprobs": top_logprobs,
    856                 "top_p": top_p,
    857                 "truncation": truncation,
    858                 "user": user,
    859             },
    860             response_create_params.ResponseCreateParamsStreaming
    861             if stream
    862             else response_create_params.ResponseCreateParamsNonStreaming,
    863         ),
    864         options=make_request_options(
    865             extra_headers=extra_headers, extra_query=extra_query, extra_body=extra_body, timeout=timeout
    866         ),
    867         cast_to=Response,
    868         stream=stream or False,
    869         stream_cls=Stream[ResponseStreamEvent],
    870     )
File /usr/local/lib/python3.10/dist-packages/openai/_base_client.py:1259, in SyncAPIClient.post(self, path, cast_to, body, options, files, stream, stream_cls)
   1245 def post(
   1246     self,
   1247     path: str,
   (...)
   1254     stream_cls: type[_StreamT] | None = None,
   1255 ) -> ResponseT | _StreamT:
   1256     opts = FinalRequestOptions.construct(
   1257         method="post", url=path, json_data=body, files=to_httpx_files(files), **options
   1258     )
-> 1259     return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
File /usr/local/lib/python3.10/dist-packages/openai/_base_client.py:1047, in SyncAPIClient.request(self, cast_to, options, stream, stream_cls)
   1044             err.response.read()
   1046         log.debug("Re-raising status error")
-> 1047         raise self._make_status_error_from_response(err.response) from None
   1049     break
   1051 assert response is not None, "could not resolve response (should never happen)"
BadRequestError: Error code: 400 - {'object': 'error', 'message': "object of type 'pydantic_core._pydantic_core.ValidatorIterator' has no len() None", 'type': 'BadRequestError', 'param': None, 'code': 400}

Description

I would like to use a local GPT-OSS-20B model as my LLM for tool calling. It does not generate a correct AIMessage with tool_calls if I do not specify use_responses_api=True. However, if I provide this argument, I encounter the above exception.

We tried to discuss this in #32885. However, I use a different approach here, and I believe this is not related to langgraph, as I do not use the create_react_agent function.

System Info

System Information

OS: Linux
OS Version: #1 SMP PREEMPT_DYNAMIC Wed May 1 15:46:25 EDT 2024
Python Version: 3.10.12 (main, May 27 2025, 17:12:29) [GCC 11.4.0]

Package Information

langchain_core: 0.3.76
langchain: 0.3.27
langchain_community: 0.3.25
langsmith: 0.3.45
langchain_chroma: 0.2.4
langchain_google_genai: 2.1.6
langchain_openai: 0.3.33
langchain_text_splitters: 0.3.11
langgraph_sdk: 0.1.74

Optional packages not installed

langserve

Other Dependencies

aiohttp<4.0.0,>=3.8.3: Installed. No version info available.
async-timeout<5.0.0,>=4.0.0;: Installed. No version info available.
chromadb>=1.0.9: Installed. No version info available.
dataclasses-json<0.7,>=0.5.7: Installed. No version info available.
filetype: 1.2.0
google-ai-generativelanguage: 0.6.18
httpx: 0.28.1
httpx-sse<1.0.0,>=0.4.0: Installed. No version info available.
httpx>=0.25.2: Installed. No version info available.
jsonpatch<2.0,>=1.33: Installed. No version info available.
langchain-anthropic;: Installed. No version info available.
langchain-aws;: Installed. No version info available.
langchain-azure-ai;: Installed. No version info available.
langchain-cohere;: Installed. No version info available.
langchain-community;: Installed. No version info available.
langchain-core<1.0.0,>=0.3.65: Installed. No version info available.
langchain-core<1.0.0,>=0.3.72: Installed. No version info available.
langchain-core<1.0.0,>=0.3.76: Installed. No version info available.
langchain-core<2.0.0,>=0.3.75: Installed. No version info available.
langchain-core>=0.3.60: Installed. No version info available.
langchain-deepseek;: Installed. No version info available.
langchain-fireworks;: Installed. No version info available.
langchain-google-genai;: Installed. No version info available.
langchain-google-vertexai;: Installed. No version info available.
langchain-groq;: Installed. No version info available.
langchain-huggingface;: Installed. No version info available.
langchain-mistralai;: Installed. No version info available.
langchain-ollama;: Installed. No version info available.
langchain-openai;: Installed. No version info available.
langchain-perplexity;: Installed. No version info available.
langchain-text-splitters<1.0.0,>=0.3.9: Installed. No version info available.
langchain-together;: Installed. No version info available.
langchain-xai;: Installed. No version info available.
langchain<1.0.0,>=0.3.25: Installed. No version info available.
langsmith-pyo3: Installed. No version info available.
langsmith<0.4,>=0.1.125: Installed. No version info available.
langsmith>=0.1.17: Installed. No version info available.
langsmith>=0.3.45: Installed. No version info available.
numpy>=1.26.0;: Installed. No version info available.
numpy>=1.26.2;: Installed. No version info available.
numpy>=2.1.0;: Installed. No version info available.
openai-agents: Installed. No version info available.
openai<2.0.0,>=1.104.2: Installed. No version info available.
opentelemetry-api: 1.37.0
opentelemetry-exporter-otlp-proto-http: Installed. No version info available.
opentelemetry-sdk: 1.37.0
orjson: 3.10.18
orjson>=3.10.1: Installed. No version info available.
packaging: 25.0
packaging>=23.2: Installed. No version info available.
pydantic: 2.11.9
pydantic-settings<3.0.0,>=2.4.0: Installed. No version info available.
pydantic<3.0.0,>=2.7.4: Installed. No version info available.
pydantic>=2.7.4: Installed. No version info available.
pytest: Installed. No version info available.
PyYAML>=5.3: Installed. No version info available.
requests: 2.31.0
requests-toolbelt: 1.0.0
requests<3,>=2: Installed. No version info available.
rich: 14.1.0
SQLAlchemy<3,>=1.4: Installed. No version info available.
tenacity!=8.4.0,<10,>=8.1.0: Installed. No version info available.
tenacity!=8.4.0,<10.0.0,>=8.1.0: Installed. No version info available.
tiktoken<1,>=0.7: Installed. No version info available.
typing-extensions>=4.7: Installed. No version info available.
zstandard: 0.23.0

use_responses_api=True

vLLM doesn't yet support the OpenAI Responses API; see vllm-project/vllm#14721.

Unset the param and it should hopefully work (I'm having issues with vLLM on my MacBook, so I can't test).

@mdrxy I have tried without use_responses_api=True.

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="openai/gpt-oss-20b",
    base_url="http://localhost/v1",
    api_key="EMPTY",
    temperature=0.0,
    max_tokens=10000,
    timeout=180,
    max_retries=2
)

After bind_tools, my llm_with_tools cannot return an AIMessage with tool_calls information. If I create my agent with the langgraph create_react_agent function, then I receive something like the following.

[screenshot of the output]

Hi, I faced the same issue and just wondered if you have solved this problem or not.

Hi @wldlsyy. I temporarily used the openai-agents SDK to interact with the GPT-OSS-20B model. You can install it with pip install openai-agents while waiting for a fix.
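For anyone who wants to try the same workaround, here is a minimal sketch of pointing the openai-agents SDK at a local vLLM endpoint through its Chat Completions API (the agent name, instructions, and endpoint URL are assumptions; adapt them to your setup):

from openai import AsyncOpenAI
from agents import Agent, Runner, OpenAIChatCompletionsModel, function_tool, set_tracing_disabled

# Point the SDK at the local vLLM server instead of api.openai.com.
client = AsyncOpenAI(base_url="http://localhost/v1", api_key="EMPTY")
set_tracing_disabled(True)  # no OpenAI platform key, so disable tracing uploads

@function_tool
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    if location.lower() == "paris":
        return "Sunny, 25°C"
    return f"No weather data for {location}"

agent = Agent(
    name="weather-agent",  # hypothetical name
    instructions="Answer weather questions using the available tools.",
    model=OpenAIChatCompletionsModel(model="openai/gpt-oss-20b", openai_client=client),
    tools=[get_weather],
)

result = Runner.run_sync(agent, "How is the weather in Paris?")
print(result.final_output)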