langchain-ai/langchainjs

Streaming does not work when using `includeRaw` with structured output

Stadly opened this issue · 1 comment

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  modelName: "gpt-4o",
  streaming: true,
  streamUsage: true,
})
  .withStructuredOutput(
    {
      title: "Joke",
      description: "Joke to tell user.",
      type: "object",
      properties: {
        setup: {
          type: "string",
          description: "The setup for the joke",
        },
        punchline: {
          type: "string",
          description: "The joke's punchline",
        },
      },
      required: ["setup", "punchline"],
      additionalProperties: false,
    },
    {
      strict: true,
      method: "jsonSchema",
      includeRaw: true,
    },
  )
  .withConfig({ runName: "joke" });

const eventStream = model.streamEvents(
  "Tell me a joke about cats",
  { version: "v2" },
  { includeNames: ["joke"] },
);
for await (const event of eventStream) {
  console.log(event);
}

Error Message and Stack Trace (if applicable)

{
  event: 'on_chain_start',
  data: { input: 'Tell me a joke about cats' },
  name: 'joke',
  tags: [],
  run_id: '71252832-c230-4f66-8ba4-df0b535bd85b',
  metadata: {}
}
{
  event: 'on_chain_stream',
  run_id: '71252832-c230-4f66-8ba4-df0b535bd85b',
  name: 'joke',
  tags: [],
  metadata: {},
  data: {
    chunk: {
      raw: AIMessageChunk {
        "id": "chatcmpl-AO2a99YEuVXe52RikBUvE4hZAcaEO",
        "content": "{\"setup\":\"Why was the cat sitting on the computer?\",\"punchline\":\"Because it wanted to keep an eye on the mouse!\"}",
        "additional_kwargs": {},
        "response_metadata": {
          "prompt": 0,
          "completion": 0,
          "usage": {
            "completion_tokens": 29,
            "prompt_tokens": 85,
            "total_tokens": 114
          },
          "finish_reason": "stop",
          "system_fingerprint": "fp_d54531d9eb"
        },
        "tool_calls": [],
        "tool_call_chunks": [],
        "invalid_tool_calls": [],
        "usage_metadata": {
          "input_tokens": 85,
          "output_tokens": 29,
          "total_tokens": 114
        }
      }
    }
  }
}
{
  event: 'on_chain_stream',
  run_id: '71252832-c230-4f66-8ba4-df0b535bd85b',
  name: 'joke',
  tags: [],
  metadata: {},
  data: { chunk: { parsed: [Object] } }
}
{
  event: 'on_chain_end',
  data: {
    output: {
      raw: AIMessageChunk {
        "id": "chatcmpl-AO2a99YEuVXe52RikBUvE4hZAcaEO",
        "content": "{\"setup\":\"Why was the cat sitting on the computer?\",\"punchline\":\"Because it wanted to keep an eye on the mouse!\"}",
        "additional_kwargs": {},
        "response_metadata": {
          "prompt": 0,
          "completion": 0,
          "usage": {
            "completion_tokens": 29,
            "prompt_tokens": 85,
            "total_tokens": 114
          },
          "finish_reason": "stop",
          "system_fingerprint": "fp_d54531d9eb"
        },
        "tool_calls": [],
        "tool_call_chunks": [],
        "invalid_tool_calls": [],
        "usage_metadata": {
          "input_tokens": 85,
          "output_tokens": 29,
          "total_tokens": 114
        }
      },
      parsed: [Object]
    }
  },
  run_id: '71252832-c230-4f66-8ba4-df0b535bd85b',
  name: 'joke',
  tags: [],
  metadata: {}
}

Description

I expected a series of incremental `on_chain_stream` events as the response is generated. Instead, only two are emitted: one with the final raw message and one with the final parsed output.
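To make the mismatch easy to check, here is a minimal sketch that tallies events by type. `StreamEventLike` and `tallyEvents` are hypothetical helper names introduced for illustration, not part of LangChain.js, and the input array is the event sequence from the log above written out by hand rather than a live stream.

```typescript
// Minimal shape of a streamEvents item; only the fields used here are typed.
// (Hypothetical type for illustration, not a LangChain.js export.)
interface StreamEventLike {
  event: string;
  name?: string;
}

// Count how many times each event type occurs in a list of events.
function tallyEvents(events: StreamEventLike[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const { event } of events) {
    counts[event] = (counts[event] || 0) + 1;
  }
  return counts;
}

// The event sequence observed in the log above, reduced to type and name.
const observed: StreamEventLike[] = [
  { event: "on_chain_start", name: "joke" },
  { event: "on_chain_stream", name: "joke" },
  { event: "on_chain_stream", name: "joke" },
  { event: "on_chain_end", name: "joke" },
];

const counts = tallyEvents(observed);
console.log(counts);
```

With a real run, the events from the `for await` loop in the example code could be pushed into an array and passed to the same helper. If streaming worked as I expected, the `on_chain_stream` count would grow with the length of the response rather than staying at two.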

System Info

  • node: v20.18.0
  • @langchain/community: 0.3.11
  • @langchain/core: 0.3.16
  • @langchain/openai: 0.3.11

I couldn't find a specific answer to this issue in the repository, though related information may exist elsewhere or I may have overlooked it. I did find a similar closed issue that may be relevant: "RunnableWithFallbacks doesn't implement streaming support". It was resolved by a pull request that added streaming support for RunnableWithFallbacks [1].

Your issue appears to be that structured output with `includeRaw: true` does not produce the expected series of stream events. This may be related to how the `streamEvents` method handles structured output when `includeRaw` is set, but I couldn't find documentation or examples addressing this exact scenario.

It may help to check whether the latest versions of LangChain.js include updates or patches related to streaming and structured output, or to read the implementation of `streamEvents` to see how it processes `includeRaw` with structured output.
