langchain-ai/langchainjs

ChatAnthropic: Allow Multiple Tool Result Blocks + Tool Content Blocks, currently throws Error

Closed this issue · 4 comments

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

The following order of messages results in errors when the ToolMessage content is not a string, even though a content array is a valid content type.

For a payload of messages like this:
HumanMessage -> AIMessage (with 2 tool calls) -> ToolMessage -> ToolMessage

_convertMessagesToAnthropicPayload converts them as follows, which results in the error:

[
  {
    role: "user",
    content: "Hi Danny here. Please get me 2 random images and then describe both please.",
  },
  {
    role: "assistant",
    content: [
      {
        type: "text",
        text: "Hello Danny! It's great to hear from you. I'd be happy to fetch two random images for you and describe them. Let's get started with retrieving the images using the available tool.",
      },
      {
        type: "tool_use",
        id: "toolu_01LG1s6Qob5txwi1HZjp4CPW",
        name: "fetchRandomImage",
        input: {},
      },
      {
        type: "tool_use",
        id: "toolu_01PhLTgBCR4Qa9GJwS8QqsLy",
        name: "fetchRandomImage",
        input: {},
      },
    ],
  },
  {
    role: "user",
    content: [
      {
        type: "tool_result",
        content: [
          {
            type: "text",
            text: "Random image from Lorem Picsum, taken at 800x600",
          },
        ],
        tool_use_id: "toolu_01LG1s6Qob5txwi1HZjp4CPW",
      },
    ],
  },
  {
    role: "user",
    content: [
      {
        type: "tool_result",
        content: [
          {
            type: "text",
            text: "Random image from Lorem Picsum, taken at 800x600",
          },
          {
            type: "image",
            source: {
              type: "base64",
              media_type: "image/jpeg",
              data: "base64string",
            },
          },
        ],
        tool_use_id: "toolu_01PhLTgBCR4Qa9GJwS8QqsLy",
      },
    ],
  },
]

Error Message and Stack Trace (if applicable)

This creates a bad request to the Anthropic API:

https://js.langchain.com/docs/troubleshooting/errors/INVALID_TOOL_RESULTS/
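For illustration, here is a minimal sketch of a pre-flight check that flags the problem in the converted payload above. It assumes the rule from Anthropic's tool-use docs that every tool_use block must be answered by a tool_result block in the very next user message. The local types and the findUnansweredToolUses helper are hypothetical and not part of @langchain/anthropic:

```typescript
// Minimal local types for illustration; the real SDK types are richer.
type ContentBlock = { type: string; [key: string]: unknown };
type AnthropicMessage = {
  role: "user" | "assistant";
  content: string | ContentBlock[];
};

// Collect the string value of `key` from every block of the given type.
function collectIds(
  msg: AnthropicMessage | undefined,
  type: string,
  key: string,
): string[] {
  if (msg === undefined || !Array.isArray(msg.content)) return [];
  return msg.content
    .filter((block) => block.type === type && typeof block[key] === "string")
    .map((block) => block[key] as string);
}

// Hypothetical helper: returns tool_use ids that have no matching
// tool_result in the immediately following user message.
function findUnansweredToolUses(messages: AnthropicMessage[]): string[] {
  const missing: string[] = [];
  messages.forEach((msg, i) => {
    if (msg.role !== "assistant") return;
    const next = messages[i + 1];
    const answered = new Set(
      next?.role === "user" ? collectIds(next, "tool_result", "tool_use_id") : [],
    );
    for (const id of collectIds(msg, "tool_use", "id")) {
      if (!answered.has(id)) missing.push(id);
    }
  });
  return missing;
}
```

Run against the payload above, this would report the second tool_use id, because its tool_result sits in a separate, later user message.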

Description

The Anthropic API supports multiple tool_result blocks in one user content field, as well as multiple content blocks within a tool_result's content.

Source: https://docs.anthropic.com/en/docs/build-with-claude/tool-use#handling-tool-use-and-tool-result-content-blocks
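To make the expected shape concrete, here is a rough sketch of how adjacent tool_result-only user messages could be folded into a single user message. This is only an illustration under my own assumptions, not the actual @langchain/anthropic implementation; the local types and the mergeToolResultTurns helper are hypothetical:

```typescript
// Minimal local types for illustration; the real SDK types are richer.
type ContentBlock = { type: string; [key: string]: unknown };
type AnthropicMessage = {
  role: "user" | "assistant";
  content: string | ContentBlock[];
};

// True when a message is a user turn made up only of tool_result blocks.
function isToolResultTurn(msg: AnthropicMessage): boolean {
  return (
    msg.role === "user" &&
    Array.isArray(msg.content) &&
    msg.content.length > 0 &&
    msg.content.every((block) => block.type === "tool_result")
  );
}

// Hypothetical helper: fold adjacent tool_result-only user turns into one.
function mergeToolResultTurns(messages: AnthropicMessage[]): AnthropicMessage[] {
  const merged: AnthropicMessage[] = [];
  for (const msg of messages) {
    const prev = merged[merged.length - 1];
    if (prev !== undefined && isToolResultTurn(prev) && isToolResultTurn(msg)) {
      // Both contents are arrays here (guaranteed by isToolResultTurn).
      prev.content = [
        ...(prev.content as ContentBlock[]),
        ...(msg.content as ContentBlock[]),
      ];
    } else {
      // Shallow-copy so the caller's messages are not mutated by later merges.
      merged.push({
        role: msg.role,
        content: Array.isArray(msg.content) ? [...msg.content] : msg.content,
      });
    }
  }
  return merged;
}
```

Each tool_result block keeps its own content array untouched, so mixed text and image blocks inside a result pass through as-is.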

I've tested this a bit, and the following cURL example demonstrates it (fill in your own base64 images):

curl --location 'https://api.anthropic.com/v1/messages' \
--header 'content-type: application/json' \
--header 'x-api-key: redacted-api-key' \
--header 'anthropic-version: 2023-06-01' \
--data '{
  "model": "claude-3-5-sonnet-20241022",
  "max_tokens": 8192,
  "tools": [
    {
      "name": "fetchRandomImage",
      "description": "Fetches a random image from Lorem Picsum",
      "input_schema": {
        "type": "object",
        "properties": {},
        "required": []
      }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": "Hi Jo here. Please get me 2 random images and then describe both please."
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "Hello Jo! I'\''d be happy to fetch two random images for you and describe them. Let'\''s get started with retrieving the images using the available tool."
        },
        {
          "type": "tool_use",
          "id": "toolu_01XtLYBDStQzpAAwUgSSz2fS",
          "name": "fetchRandomImage",
          "input": {}
        },
        {
          "type": "tool_use",
          "id": "toolu_016Cy1WFkPdk3ecdQnt4D9Bq",
          "name": "fetchRandomImage",
          "input": {}
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "tool_result",
          "tool_use_id": "toolu_01XtLYBDStQzpAAwUgSSz2fS",
          "content": [
            {
              "type": "text",
              "text": "Random image from Lorem Picsum, taken at 800x600"
            },
            {
              "type": "image",
              "source": {
                "type": "base64",
                "media_type": "image/jpeg",
                "data": "base64"
              }
            }
          ]
        },
        {
          "type": "tool_result",
          "tool_use_id": "toolu_016Cy1WFkPdk3ecdQnt4D9Bq",
          "content": [
            {
              "type": "text",
              "text": "Random image from Lorem Picsum, taken at 800x600"
            },
            {
              "type": "image",
              "source": {
                "type": "base64",
                "media_type": "image/jpeg",
                "data": "base64"
              }
            }
          ]
        }
      ]
    }
  ]
}'

Currently, @langchain/anthropic does not take this possibility into account, making dynamic/multi-modal experiences harder to implement.

The only workaround is to restructure the history so that AIMessages and ToolMessages alternate, with each AIMessage carrying a single tool call. This is really awkward and inefficient when constructing the payload.
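A rough sketch of what that workaround amounts to at the payload level, under one reading of it: one assistant turn with N tool_use blocks is expanded into N assistant/user pairs. The local types and the splitIntoAlternatingPairs helper are hypothetical, and any leading text is kept only on the first assistant turn by assumption:

```typescript
// Minimal local types for illustration; the real SDK types are richer.
type ContentBlock = { type: string; [key: string]: unknown };
type AnthropicMessage = { role: "user" | "assistant"; content: ContentBlock[] };

// Hypothetical helper: expand one assistant turn with several tool_use
// blocks, plus their tool_result blocks, into alternating single-call pairs.
function splitIntoAlternatingPairs(
  assistant: AnthropicMessage,
  results: ContentBlock[],
): AnthropicMessage[] {
  const toolUses = assistant.content.filter((b) => b.type === "tool_use");
  const preamble = assistant.content.filter((b) => b.type !== "tool_use");
  const out: AnthropicMessage[] = [];
  toolUses.forEach((toolUse, i) => {
    const result = results.find((r) => r.tool_use_id === toolUse.id);
    if (result === undefined) {
      throw new Error(`No tool_result for tool_use id ${String(toolUse.id)}`);
    }
    // Keep any leading text only on the first assistant turn.
    out.push({
      role: "assistant",
      content: i === 0 ? [...preamble, toolUse] : [toolUse],
    });
    out.push({ role: "user", content: [result] });
  });
  return out;
}
```

This rewrites the history so the model appears to have made its calls one at a time, which is part of why the workaround feels awkward.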

Note that langchain itself constructs the bad request based on the API responses:
HumanMessage -> AIMessage (with 2 tool calls) -> ToolMessage -> ToolMessage

This would work if content for ToolMessages were strings, but I need them to be content arrays (as a raw API request would accept).

System Info

I'm using @langchain/anthropic@0.3.8

npm info langchain:

langchain@0.3.7 | MIT | deps: 12 | versions: 302
Typescript bindings for langchain
https://github.com/langchain-ai/langchainjs/tree/main/langchain/

keywords: llm, ai, gpt3, chain, prompt, prompt engineering, chatgpt, machine learning, ml, openai, embeddings, vectorstores

dist
.tarball: https://registry.npmjs.org/langchain/-/langchain-0.3.7.tgz
.shasum: a0d2010a9f2dab473eaebfaac5228fac7c2a6f17
.integrity: sha512-6/Gkk9Zez3HkbsETFxZVo1iKLmaK3OzkDseC5MYFKVmYFDXFAOyJR3srJ9P61xF8heVdsPixqYIsejBn7/9dXg==
.unpackedSize: 2.9 MB

dependencies:
@langchain/openai: >=0.1.0 <0.4.0        js-yaml: ^4.1.0                          openapi-types: ^12.1.3                   yaml: ^2.2.1                             
@langchain/textsplitters: >=0.0.0 <0.2.0 jsonpointer: ^5.0.1                      p-retry: 4                               zod-to-json-schema: ^3.22.3              
js-tiktoken: ^1.0.12                     langsmith: ^0.2.8                        uuid: ^10.0.0                            zod: ^3.22.4                             

maintainers:
- nfcampos <nuno@boringbits.io>
- jacoblee93 <jacoblee93@gmail.com>
- andrewnguonly <andrewnguonly@gmail.com>
- davidduong <david@duong.cz>
- hwchase17 <hw.chase.17@gmail.com>
- basproul <braceasproul@gmail.com>

dist-tags:
latest: 0.3.7                              next: 0.3.2-rc.0                           tag-for-publishing-older-releases: 0.2.20  

published 4 days ago by jacoblee93 <jacoblee93@gmail.com>

To address the "INVALID_TOOL_RESULTS" error when using the Anthropic API with LangChain.js, ensure that the ToolMessages are formatted correctly. The error arises because the content of ToolMessages is expected to be strings, but you are using arrays. The Anthropic API does support multiple tool_result blocks within a single user content field, but the current implementation in @langchain/anthropic does not accommodate this.

Here's how you can resolve the issue:

  1. Format ToolMessages as Strings: Convert the content of each ToolMessage to a string. This is necessary because the current implementation expects string content for ToolMessages.

  2. Combine Tool Results: If the API supports multiple tool_result blocks, you can combine these into a single user message with multiple tool_result entries, each associated with its respective tool_call_id.

  3. Adjust Message Structure: Ensure that the structure of your messages aligns with the expected format. You might need to adjust the order or content of your messages to fit the current limitations of the @langchain/anthropic package.

Here's an example of how to format the messages correctly:

import { AIMessage, HumanMessage, ToolMessage } from "@langchain/core/messages";

const messageHistory = [
  new HumanMessage("Hi Danny here. Please get me 2 random images and then describe both please."),
  new AIMessage({
    content: "",
    tool_calls: [
      {
        name: "fetchRandomImage",
        id: "toolu_01LG1s6Qob5txwi1HZjp4CPW",
        args: {},
      },
      {
        name: "fetchRandomImage",
        id: "toolu_01PhLTgBCR4Qa9GJwS8QqsLy",
        args: {},
      },
    ],
  }),
  new ToolMessage({
    name: "fetchRandomImage",
    tool_call_id: "toolu_01LG1s6Qob5txwi1HZjp4CPW",
    content: "Random image from Lorem Picsum, taken at 800x600",
  }),
  new ToolMessage({
    name: "fetchRandomImage",
    tool_call_id: "toolu_01PhLTgBCR4Qa9GJwS8QqsLy",
    content: "Random image from Lorem Picsum, taken at 800x600",
  }),
];

const formattedMessages = _convertMessagesToAnthropicPayload(messageHistory);

This approach ensures that each ToolMessage is correctly formatted as a string and corresponds to a tool_call in the AIMessage [1][2].

hey @danny-avila thanks for flagging this! Will TAL now

@danny-avila fix is released on version 0.3.11

Please ping me if you run into more issues!

Awesome, works like a charm!