langchain-ai/langchain

`create_agent()` fails with `RunnableWithFallbacks` when using middleware

Opened this issue · 3 comments

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Example Code

from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langchain_openai import ChatOpenAI
from langchain.tools import tool

# Setup models
main_model = ChatOpenAI(model="gpt-4")
fallback_model = ChatOpenAI(model="gpt-3.5-turbo")

# Create a model with fallback - this returns RunnableWithFallbacks
model_with_fallback = main_model.with_fallbacks([fallback_model])

# Setup middleware for conversation summarization
middleware = [
    SummarizationMiddleware(
        model=main_model,
        max_tokens_before_summary=10000,
        messages_to_keep=20
    )
]

# Define a simple tool for the agent
@tool
def get_weather(city: str) -> str:
    """Get weather information for a city."""
    return f"The weather in {city} is sunny."

# This fails with AssertionError
agent = create_agent(
    model=model_with_fallback,  # RunnableWithFallbacks type
    prompt="You are a helpful assistant that can provide weather information.",
    tools=[get_weather],
    middleware=middleware  # When middleware exists, assertion fails
)

# Run the agent (create_agent graphs take a messages-style input)
result = agent.invoke({"messages": [{"role": "user", "content": "What's the weather in New York?"}]})
print(result)

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "test_agent.py", line 24, in <module>
    agent = create_agent(
  File "/path/to/langchain/agents/react_agent.py", line 1145, in create_agent
    assert isinstance(model, str | BaseChatModel)
AssertionError

The assertion error occurs at line 1145 in react_agent.py:

if middleware:
    assert isinstance(model, str | BaseChatModel)  # This fails for RunnableWithFallbacks

Description

What I'm trying to do:
I'm building a production AI agent that needs both fallback models for reliability (handling rate limits, API failures) and middleware for memory management (conversation summarization, token limits). I want to use create_agent() with a model that has fallbacks attached while also utilizing middleware features.

What I expect to happen:
The create_agent() function should accept models wrapped with with_fallbacks() and properly work with both the fallback chain and middleware features. This is a common production requirement where we need:

  • Primary model with automatic fallback to a secondary model
  • Middleware for managing long conversations and token limits

What is currently happening:
When calling create_agent() with both a RunnableWithFallbacks model (created via model.with_fallbacks()) and middleware enabled, the function raises an AssertionError. The type check at line 1145 only accepts str | BaseChatModel but doesn't account for RunnableWithFallbacks, which is a wrapper class that contains a BaseChatModel but isn't a direct subclass of it.

The type hierarchy shows:

  • RunnableWithFallbacks → RunnableSerializable → Runnable
  • BaseChatModel → BaseLanguageModel → RunnableSerializable → Runnable

They share RunnableSerializable as a common ancestor but are in different inheritance branches, causing the type check to fail.
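The mismatch can be reproduced without LangChain at all. The stand-in classes below are hypothetical mirrors of the names above (not the real LangChain implementations); they show how two classes can share an ancestor while still failing a subclass check:

```python
# Hypothetical stand-ins mirroring the class names above -- NOT the real
# LangChain implementations -- to show why the isinstance check fails.

class RunnableSerializable:
    """Shared ancestor of both branches."""

class BaseLanguageModel(RunnableSerializable):
    pass

class BaseChatModel(BaseLanguageModel):
    """Chat-model branch: BaseChatModel -> BaseLanguageModel -> RunnableSerializable."""

class RunnableWithFallbacks(RunnableSerializable):
    """Wrapper branch: RunnableWithFallbacks -> RunnableSerializable."""

model = RunnableWithFallbacks()

assert isinstance(model, RunnableSerializable)   # shared ancestor: passes
assert not isinstance(model, BaseChatModel)      # sibling branch: fails
# ...so `assert isinstance(model, str | BaseChatModel)` raises AssertionError.
```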

Current workaround (loses functionality):

# This workaround defeats the purpose of having fallbacks
if isinstance(model, RunnableWithFallbacks):
    model = model.runnable  # Loses fallback capability

This forces users to choose between reliability (fallbacks) OR memory management (middleware), but not both, which significantly impacts production deployments.
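To make that trade-off concrete, here is a minimal stand-in in the spirit of RunnableWithFallbacks (hypothetical classes, not LangChain APIs): unwrapping to `.runnable` leaves the flaky primary model with no safety net.

```python
class FlakyPrimary:
    """Stand-in for a primary model that hits a rate limit."""
    def invoke(self, messages):
        raise RuntimeError("429: rate limited")

class StableFallback:
    """Stand-in for a secondary model that succeeds."""
    def invoke(self, messages):
        return "fallback answer"

class WithFallbacks:
    """Toy version of RunnableWithFallbacks: try primary, then fallbacks in order."""
    def __init__(self, runnable, fallbacks):
        self.runnable = runnable
        self.fallbacks = fallbacks

    def invoke(self, messages):
        try:
            return self.runnable.invoke(messages)
        except Exception:
            for fb in self.fallbacks:
                return fb.invoke(messages)
            raise

wrapped = WithFallbacks(FlakyPrimary(), [StableFallback()])
assert wrapped.invoke([]) == "fallback answer"  # chain recovers from the failure

unwrapped = wrapped.runnable  # the workaround above
try:
    unwrapped.invoke([])
except RuntimeError:
    print("no fallback left: primary failure now propagates")
```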

System Info

System Information
------------------
> OS:             Linux
> OS Version:     6.6.87.2-microsoft-standard-WSL2
> Python Version: 3.12.8

Package Information
-------------------
> langchain: 1.0.0a9
> langgraph: 1.0.0a3

> langchain_core: 0.3.29
> langchain_community: 0.3.17
> langsmith: 0.2.11
> langchain_openai: 0.2.14


Other Dependencies
-----------------
> aiohttp: 3.11.11
> async-timeout: 5.0.1
> httpx: 0.28.1
> jsonpatch: 1.33
> numpy: 1.26.4
> openai: 1.57.4
> orjson: 3.10.13
> packaging: 24.2
> pydantic: 2.10.5
> PyYAML: 6.0.2
> requests: 2.32.3
> SQLAlchemy: 2.0.37
> tenacity: 9.0.0
> tiktoken: 0.8.0
> typing-extensions: 4.12.2

Impact: This issue affects production systems that require both reliability features (fallback models) and conversation management (middleware), which is a common pattern in enterprise AI applications.

Suggested Fix: The type check should be extended to include RunnableWithFallbacks or use duck typing to check for required methods rather than strict type checking:

# Option 1: Extend type check
if middleware:
    assert isinstance(model, str | BaseChatModel | RunnableWithFallbacks)

# Option 2: Duck typing (more flexible)
if middleware:
    if not isinstance(model, str):
        required_attrs = ['invoke', 'stream', 'batch']
        assert all(hasattr(model, attr) for attr in required_attrs), \
            f"Model must have {required_attrs} methods"

I'm happy to submit a PR with the fix if the maintainers agree on the approach. This would help many users who need both fallback reliability and middleware features in production environments.

cc @sydney-runkle -- I suspect users can't do this through modifyModelRequest either right now. We could probably specialize with_fallbacks for chat models so that it returns a chat model instance.

Technical Analysis and Solution

Hi @bart0401 and @eyurtsev,

This is indeed a critical issue that affects production deployments where both reliability (fallbacks) and conversation management (middleware) are essential. I've analyzed the problem and can provide some insights.

Root Cause Analysis

The issue stems from the strict type checking in create_agent() at line 1145:

if middleware:
    assert isinstance(model, str | BaseChatModel)

The inheritance hierarchy shows why this fails:

  • RunnableWithFallbacks inherits from RunnableSerializable
  • BaseChatModel inherits from BaseLanguageModel → RunnableSerializable
  • Both share RunnableSerializable as a common ancestor but are in different branches

Proposed Solutions

Option 1: Extend Type Check (Immediate Fix)

from langchain_core.runnables.fallbacks import RunnableWithFallbacks

if middleware:
    assert isinstance(model, str | BaseChatModel | RunnableWithFallbacks)

Option 2: Duck Typing (More Robust)

if middleware:
    if not isinstance(model, str):
        # Check for required methods instead of strict type checking
        required_methods = ['invoke', 'ainvoke', 'stream', 'astream']
        missing_methods = [method for method in required_methods 
                          if not hasattr(model, method)]
        if missing_methods:
            raise TypeError(f"Model must implement methods: {missing_methods}")

Option 3: Unwrap Fallback Models (Backward Compatible)

from langchain_core.runnables.fallbacks import RunnableWithFallbacks

if middleware:
    # Extract the primary model if it's wrapped in fallbacks
    actual_model = model
    if isinstance(model, RunnableWithFallbacks):
        # Get the primary runnable (first in the chain)
        actual_model = model.runnable
    
    assert isinstance(actual_model, str | BaseChatModel)

Impact on Middleware Functionality

The middleware system works by intercepting the model's invoke/stream methods. With RunnableWithFallbacks, the middleware will:

  1. Apply to the entire fallback chain (desired behavior)
  2. Maintain fallback logic when primary model fails
  3. Continue working with conversation summarization

Testing Considerations

We should ensure:

  • Middleware applies correctly to fallback chains
  • Fallback behavior is preserved when middleware is active
  • Memory management works across model switches
  • Performance impact is minimal

Recommended Approach

I recommend Option 1 as an immediate fix, followed by Option 2 for long-term robustness. This approach:

  • ✅ Fixes the immediate issue
  • ✅ Maintains backward compatibility
  • ✅ Supports future runnable types
  • ✅ Provides better error messages

Would you like me to prepare a PR with the implementation? I can include comprehensive tests covering:

  • Basic fallback + middleware functionality
  • Error handling scenarios
  • Performance benchmarks
  • Documentation updates

Gabriel

We will be addressing this issue using middleware to handle model fallbacks.
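For anyone landing here before that work ships, the direction can be sketched with stand-in classes (hypothetical names and hook signature, not the shipped middleware API): fallbacks move out of the model wrapper and into a middleware hook around each model call, so create_agent() still receives a plain chat model.

```python
class PrimaryModel:
    """Stand-in chat model that fails (e.g., rate limited)."""
    def invoke(self, messages):
        raise RuntimeError("primary unavailable")

class FallbackModel:
    """Stand-in chat model used when the primary fails."""
    def invoke(self, messages):
        return "response from fallback"

class ModelFallbackMiddleware:
    """Hypothetical middleware: wraps each model call, trying fallbacks in order."""
    def __init__(self, *fallbacks):
        self.fallbacks = fallbacks

    def wrap_model_call(self, model, messages):
        try:
            return model.invoke(messages)
        except Exception as primary_exc:
            last_exc = primary_exc
            for fb in self.fallbacks:
                try:
                    return fb.invoke(messages)
                except Exception as exc:
                    last_exc = exc
            raise last_exc

mw = ModelFallbackMiddleware(FallbackModel())
print(mw.wrap_model_call(PrimaryModel(), []))  # response from fallback
```

Because the fallback logic lives in the middleware layer rather than in a Runnable wrapper, the `isinstance(model, str | BaseChatModel)` check never sees a wrapped model, and fallbacks compose with other middleware such as SummarizationMiddleware.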