encode/starlette

Background tasks don't work with middleware that subclasses `BaseHTTPMiddleware`

retnikt opened this issue · 43 comments

When using background tasks with middleware, requests are not processed until the background task has finished.

  1. Use the example below
  2. Make several requests in a row - works as expected
  3. Uncomment app.add_middleware(TransparentMiddleware) and re-run
  4. Make several requests in a row - subsequent ones are not processed until the 10 second sleep has finished (the first request returns before then though).

The same behaviour occurs with asyncio.sleep (async) and time.sleep (sync, run in threadpool)

import asyncio
import uvicorn
from starlette.applications import Starlette
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.background import BackgroundTask
from starlette.responses import JSONResponse


class TransparentMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # simple middleware that does absolutely nothing
        response = await call_next(request)
        return response


app = Starlette(debug=True)
# uncomment to break
# app.add_middleware(TransparentMiddleware)


@app.route("/")
async def test(_):
    task = BackgroundTask(asyncio.sleep, 10)
    return JSONResponse("hello", background=task)


if __name__ == '__main__':
    uvicorn.run(app)
JayH5 commented

What versions of Python, Starlette, and Uvicorn are you using? I'm not able to recreate with Python 3.8.2, Starlette 0.13.3, and Uvicorn 0.11.5.

Python 3.8.2, Uvicorn 0.11.4, Starlette 0.13.2. I'll try updating to see if that makes any difference.

Nope, now on Starlette 0.13.3 and Uvicorn 0.11.5, but I'm still getting the same behaviour. What OS are you on? I'm using Arch Linux

I have the same issue (Windows 10, Python 3.7, Starlette 0.13.4, Uvicorn 0.11.5). Is there any workaround?

Thanks for the hint @retnikt, will try that.

Having the same issue. My background task is blocking and needs to run in the thread_pool. Everything works as expected without middleware, but adding middleware causes some requests to block.

If I run this sample and make a number of requests with wrk using five connections, I see the first five return immediately, and then the remaining requests all hang until wrk ceases, after which I see the following exceptions (one for each connection), all stating that the handler is closed:

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/miniconda3/envs/starlette/lib/python3.8/site-packages/uvicorn/protocols/http/httptools_impl.py", line 385, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/miniconda3/envs/starlette/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
  File "./starlette/applications.py", line 111, in __call__
    await self.middleware_stack(scope, receive, send)
  File "./starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "./starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "./starlette/middleware/base.py", line 26, in __call__
    await response(scope, receive, send)
  File "./starlette/responses.py", line 228, in __call__
    await run_until_first_complete(
  File "./starlette/concurrency.py", line 18, in run_until_first_complete
    [task.result() for task in done]
  File "./starlette/concurrency.py", line 18, in <listcomp>
    [task.result() for task in done]
  File "./starlette/responses.py", line 225, in stream_response
    await send({"type": "http.response.body", "body": b"", "more_body": False})
  File "./starlette/middleware/errors.py", line 156, in _send
    await send(message)
  File "/miniconda3/envs/starlette/lib/python3.8/site-packages/uvicorn/protocols/http/httptools_impl.py", line 510, in send
    self.transport.write(body)
  File "uvloop/handles/stream.pyx", line 673, in uvloop.loop.UVStream.write
  File "uvloop/handles/handle.pyx", line 159, in uvloop.loop.UVHandle._ensure_alive
RuntimeError: unable to perform operation on <TCPTransport closed=True reading=False 0x7fdb3008d990>; the handler is closed

Only five requests are successfully handled in this case.

However, if I modify the TransparentMiddleware class given here to have a simple __call__ instead of dispatch, there are no exceptions and it will return 2,500 requests per second without dropping any:

class TransparentMiddleware(BaseHTTPMiddleware):
    async def __call__(self, scope, receive, send) -> None:
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        await self.app(scope, receive, send)

More interestingly, if I add some logging calls, we can see when things are firing. First, in the exceptions case:

class TransparentMiddleware(BaseHTTPMiddleware):
    # async def __call__(self, scope, receive, send) -> None:
    #     if scope["type"] != "http":
    #         await self.app(scope, receive, send)
    #         return

    #     log.info("__calling__ the middlewares!")
    #     await self.app(scope, receive, send)

    async def dispatch(self, request, call_next):
        # simple middleware that does absolutely nothing
        log.info("dispatching!")
        response = await call_next(request)
        return response

async def some_sleeper():
    log.info("sleeping!")
    await asyncio.sleep(10)
    log.info("waking up now!")

@app.route("/")
async def test(_):
    task = BackgroundTask(some_sleeper)
    return JSONResponse("hello", background=task)


if __name__ == '__main__':
    uvicorn.run(app, host="0.0.0.0", port=8000)

Results in the following:

โฏ python -m starlette_middleware_background
INFO:     Started server process [53410]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO:     dispatching!
INFO:     dispatching!
INFO:     sleeping!
INFO:     sleeping!
INFO:     dispatching!
INFO:     dispatching!
INFO:     dispatching!
INFO:     sleeping!
INFO:     sleeping!
INFO:     sleeping!
INFO:     127.0.0.1:59302 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:59303 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:59304 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:59305 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:59306 - "GET / HTTP/1.1" 200 OK
INFO:     waking up now!
INFO:     waking up now!
INFO:     waking up now!
INFO:     waking up now!
INFO:     waking up now!
ERROR:    Exception in ASGI application
Traceback (most recent call last):
...[EXCEPTIONS: one per connection]...

In the no-exceptions case things are interleaved in a manner closer to what I would have expected:

INFO:     __calling__ the middlewares!
INFO:     127.0.0.1:59324 - "GET / HTTP/1.1" 200 OK
INFO:     sleeping!
INFO:     __calling__ the middlewares!
INFO:     127.0.0.1:59321 - "GET / HTTP/1.1" 200 OK
INFO:     sleeping!
INFO:     __calling__ the middlewares!
INFO:     127.0.0.1:59325 - "GET / HTTP/1.1" 200 OK
INFO:     sleeping!
INFO:     waking up now!
INFO:     __calling__ the middlewares!
INFO:     127.0.0.1:59325 - "GET / HTTP/1.1" 200 OK
INFO:     sleeping!
INFO:     __calling__ the middlewares!
INFO:     127.0.0.1:59325 - "GET / HTTP/1.1" 200 OK
INFO:     sleeping!
...
INFO:     waking up now!
INFO:     waking up now!
INFO:     waking up now!
INFO:     waking up now!
INFO:     waking up now!
INFO:     waking up now!
...

The difference in behavior comes down to this call_next function.

As written, the call_next function starts a StreamingResponse but does not exit until the wrapped app has completed. This includes background tasks. Some clients, such as curl, close the connection as soon as they receive Content-Length bytes, which makes it appear as if the first few responses are returning quickly. curl --ignore-content-length and disabling Keep-Alive can help when debugging this.

I'm not an expert in asyncio or ASGI, so I'm not sure if this fix has unintended consequences....

Changing:

        async def body_stream() -> typing.AsyncGenerator[bytes, None]:
            while True:
                message = await queue.get()
                if message is None:
                    break
                assert message["type"] == "http.response.body"
                yield message.get("body", b"")
            task.result()

To:

        async def body_stream() -> typing.AsyncGenerator[bytes, None]:
            while True:
                message = await queue.get()
                if message is None:
                    break
                assert message["type"] == "http.response.body"
                yield message.get("body", b"")
                if not message.get("more_body"):
                    break
            # task.result()

allows the StreamingResponse to finish sending data to the client while the task is run in the background.

The notable thing to me is that dispatch in the BaseHTTPMiddleware is using a StreamingResponse and building this tasks-and-queues thing whereas none of the other middleware provided by starlette (examples include authentication, error-handling, etc.) use this same pattern. Why is that? These other classes are all simple classes with a __call__ method defined: no StreamingResponse included.

In my own projects, I've defined a handful of middleware classes, but I haven't implemented dispatch or call_next and I haven't inherited from BaseHTTPMiddleware: I have always naively built mine like the other provided middleware by writing a __call__ method.

@retnikt can you try to implement your middleware with a __call__ method and see if it "fixes" the problem (not a real fix, just a validation that we're seeing the same issue)?

I think at the very least that the order of execution is inconsistent between these two implementations. I'm also not sure why the BaseHTTPMiddleware is the only one to offer streaming by default.

@erewok yes, this issue doesn't happen with __call__. BTW I think I mentioned that in the OP.

I have a similar issue in my current project I'm working on.

We are using BackgroundTask for one long operation we have. Lately, I tried to add a middleware inherited from BaseHTTPMiddleware, which is supposed to open a SQLAlchemy session, inject it into the request, and close it when request processing finishes.

After doing so, the BackgroundTask functionality broke, and the frontend waited for the task to finish before realising the request was done.

I saw a reference to current issue in this MR. And it looked like it could solve this issue.

Unfortunately, until it is merged I had to come up with a solution. Basically, I tried using the new version of BaseHTTPMiddleware from that MR, adding my session logic into its dispatch function.
And the issue got solved for me. The background task runs in the background, and the middleware does what it is expected to do.

So for anyone waiting for this issue to be solved: that should happen after that MR gets reviewed and merged :)

Huge thanks to @erewok, as this issue turned out to be quite a blocker. It would be hard for me to use a __call__ function, as I couldn't modify the request.state value there.

Somehow hypercorn is a workaround for this issue.

Hi all, I had a PR to fix this behavior (#1017 ) but it relied on forcing the response to remain unevaluated, which further confused those users who were relying on having a fully-evaluated response inside their middleware functions (#1022 ). In other words, there are contradictory goals here.

After looking at this BaseHTTPMiddleware class over the last week, I have come to the following conclusions:

  • We should discourage the use of BaseHTTPMiddleware in favor of raw ASGI middleware which includes a method def __call__(self, scope, receive, send):, and

  • If people still want to use this middleware class, it should never be used with StreamingResponse/FileResponse endpoints.

Unfortunately, while the BaseHTTPMiddleware class offers an easier interface because it provides a request argument and promises something like a response, this middleware class also encourages users to think about the ASGI app's functionality in a "complete" or "finished" way. This means this class will either load the entirety of streaming requests into memory (#1012 ) and run the background tasks before returning the response (this issue), or, if we fix those problems, it will then encourage users to leave resources in a pending or open state, an arguably worse result. In short, it's problematic.

Again, these problems should be absent if you avoid subclassing BaseHTTPMiddleware.
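For readers unfamiliar with the term, a "raw" ASGI middleware is just a callable taking (scope, receive, send); a minimal sketch (the class name here is illustrative) shows there is no Request/Response layer and therefore nothing that buffers the body or waits on background tasks:

```python
class PassthroughMiddleware:
    """Pure-ASGI middleware: wraps an app without touching the message flow."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        # Inspect or modify scope/messages here if needed. This version
        # simply delegates, so background tasks behave exactly as they
        # would without the middleware installed.
        await self.app(scope, receive, send)
```

It is installed like any other middleware, e.g. app.add_middleware(PassthroughMiddleware).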

Lastly, I think it's worth leaving this issue open so that other users who have the same problem can see it and see that it remains an open issue.

@erewok thanks for your efforts!

But how would raw ASGI middleware solve the issue?

@dmig-alarstudios see @retnikt's comment above to that effect:

@erewok yes, this issue doesn't happen with __call__. BTW I think I mentioned that in the OP

IMO this isn't really (just) an issue with BaseHTTPMiddleware, but rather with how background tasks are implemented; this should be fixed by making them either:

  • detached from a Response by using asyncio.create_task or similar, so they actually run in the background
  • or, renamed to something more appropriate to reflect what they're actually for (i.e. for cleaning up after a request is handled)
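A sketch of the first option (illustrative names; this is not starlette's actual implementation): scheduling the coroutine with asyncio.create_task detaches it from the response lifecycle. Keeping a strong reference to the task matters, since the event loop only holds weak references to tasks:

```python
import asyncio

pending_tasks = set()  # strong references so tasks aren't garbage-collected

async def background_work(log):
    await asyncio.sleep(0.01)  # stand-in for real work
    log.append("background done")

async def endpoint(log):
    # Schedule the work and return immediately; the "response" does not
    # wait for background_work to finish.
    task = asyncio.create_task(background_work(log))
    pending_tasks.add(task)
    task.add_done_callback(pending_tasks.discard)
    log.append("response sent")
    return "hello"

async def main():
    log = []
    await endpoint(log)
    # Only for the demo: drain outstanding work before the loop closes.
    await asyncio.gather(*pending_tasks)
    return log
```

The trade-off is that nothing ties the task's lifetime to the server's shutdown, which is exactly the "detached" behaviour being proposed.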

Yes, it is a broader issue with background tasks but that also implies that something should be done differently with how concurrency is being managed in the app. @retnikt see this comment for further discussion.

Shame on me: I never thought about using asyncio.create_task.

But I'm still wondering: running the same testcase under hypercorn solves this issue, but why? Is it a hypercorn bug?

@erewok Thank you for your work on this.

After your suggestion to use raw ASGI middleware, the link to AuthenticationMiddleware as an example, and a bit of code digging, I think I found a way to do what I intended with raw ASGI middleware.

Basically, for anyone who is trying to inject something into the request and perform some cleanup on that injected instance after the request has finished: you can do this by placing your instance inside the scope mapping and accessing it through request["your_instance_name"]. This works because scope is a MutableMapping and HTTPConnection has a __getitem__ method.

I only hope that wrapping the call to the app in a try/finally inside the middleware doesn't break anything.
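A sketch of that pattern (SessionMiddleware, session_factory, and the "db_session" key are illustrative names, not starlette API):

```python
class SessionMiddleware:
    """Puts a per-request resource into the ASGI scope and always
    cleans it up after the wrapped app finishes, even on error."""

    def __init__(self, app, session_factory):
        self.app = app
        self.session_factory = session_factory

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        session = self.session_factory()
        # Endpoints can read this as request["db_session"], because
        # Request/HTTPConnection delegate __getitem__ to the scope.
        scope["db_session"] = session
        try:
            await self.app(scope, receive, send)
        finally:
            session.close()
```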

Any updates on this issue?

I have the same issue, and if I add

await asyncio.sleep(1)

inside any method that is triggered by the task, it works fine. I don't understand how it can work, but it does.
Even if I put time.sleep(30) after the asyncio.sleep, I get the result immediately, but the process finishes 30 seconds later, which is correct.

Even await asyncio.sleep(0.1) would do the job. The key here is that await switches the asyncio context.
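That observation can be shown in isolation: an await that actually suspends (even asyncio.sleep(0)) hands control back to the event loop so other pending tasks get a chance to run:

```python
import asyncio

order = []

async def first():
    order.append("first: before await")
    await asyncio.sleep(0)  # suspends and yields to the event loop
    order.append("first: after await")

async def second():
    order.append("second runs while first is suspended")

async def main():
    # gather schedules both coroutines as tasks on the same loop,
    # in argument order.
    await asyncio.gather(first(), second())

asyncio.run(main())
```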

What about sync background-task functions? These functions have the same issue.

@dmig-alarstudios @retnikt Can you help me with asyncio.create_task()? I tried to use it, but it did not help; here is my code:
https://stackoverflow.com/questions/66427029/how-to-use-uvicorn-with-asyncio-create-task-to-put-task-in-background

@Kruszylo When working with async you need to use asyncio.sleep instead of time.sleep because otherwise it will block the whole process

@retnikt I used time.sleep on purpose, because my original project has async/sync functions that run very slowly. In the real case, long_task() is an ML training task. I can mock it with a for loop:

async def long_task():
    print('Starting long task...')
    global FIFO
    FIFO = [1, *FIFO]
    # mock long calc with long loop
    j = 0
    for i in range(10000000):
        j = j + 1
    FIFO.pop()
    print('Long task finished!')

The API behavior does not change: it queues requests and waits until the for loop finishes.

@Kruszylo you should've used asyncio.sleep() instead. An ML task should be, in my opinion, slow but non-blocking.

otherwise, your example is correct

Python is single-threaded even with async because of the GIL. You need to either:

  • run the long code in another process (see the multiprocessing module)
  • run the long code in another thread, e.g. using starlette.concurrency.run_in_threadpool. This will require the GIL to be released by the long-running code which is only possible for modules written in C and will require support from whatever libraries you're using.

In the long run, and if you plan to scale the project at all, you probably need to have the ML computations running in a separate service entirely, communicating over some kind of message broker.

This is not related to this issue, though.
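For the thread route, a minimal sketch using only the standard library (asyncio.to_thread, Python 3.9+, which serves the same purpose as starlette.concurrency.run_in_threadpool):

```python
import asyncio
import time

def blocking_work(n):
    # Stands in for slow, blocking code (e.g. blocking IO, or a
    # C extension that releases the GIL).
    time.sleep(0.01)
    return n * 2

async def endpoint():
    # The event loop stays free to serve other requests while
    # blocking_work runs in a worker thread.
    return await asyncio.to_thread(blocking_work, 21)
```

Note the GIL caveat from the list above still applies: pure-Python CPU-bound work in a thread won't run in parallel with the event loop, so a separate process is the safer choice for heavy computation.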

I've found a way to start long computations without blocking the API; check my answer.

gnat commented

Passing BackgroundTask to TemplateResponse for async functions has been broken for me the last few releases, so I've been resorting to simply using: asyncio.create_task(email(...))

Perhaps tests/test_background.py should add asyncio.sleep() to emulate work being done if this is breaking?

@tomchristie Any insight on this issue?

I'm not able to reproduce this anymore, is this still a thing?

Yes, I can reproduce it with the latest version 0.70.0.

If I enable the Gzip middleware:

from fastapi.middleware.gzip import GZipMiddleware
app.add_middleware(GZipMiddleware, minimum_size=1000)

then running BackgroundTasks blocks the responses until the background task finishes. If I remove the middleware, the background task works as expected and HTTP responses are returned immediately without waiting for it.

Finally found a workaround by exploiting run_in_executor:

from asyncio import get_event_loop
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def run_sync_code(task, *args, **kwargs):
    executor = ThreadPoolExecutor()
    loop = get_event_loop()
    loop.run_in_executor(executor, partial(task, *args, **kwargs))

Then in your favorite FastAPI endpoint:

@app.post("/debug/background_task", tags=["debug"])
async def __debug_background_task():
    def sleep_task(*args, **kwargs):
        print("start sleep....", args, kwargs)
        time.sleep(5)
        print("end sleep....")

    run_sync_code(sleep_task, 1, 2, a="a", b="b")

I got this bug after updating fastapi 0.67.0 -> 0.70.0

This code worked in my project with a few middlewares, but after migrating to the newer version, "im alive" no longer prints in my logs:

async def task():
    logger.info("im a task")
    await sleep(1)
    logger.info("im alive")


@app.get("/")
async def root(background_tasks: BackgroundTasks):
    background_tasks.add_task(task)
    return {"status": "ok"}

It happens even when I use only one dummy middleware:

class DM(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        return await call_next(request)

After removing all middlewares, this code works as expected.

I tried to reproduce this in a new, clean project, but I could not.

If anyone finds this thread and is trying to add headers to responses, this is how I ended up doing it, using the information from the discussion above.

from starlette.types import ASGIApp, Message, Receive, Scope, Send
from starlette.datastructures import MutableHeaders

# Custom class required to add headers in order to not block background tasks
# https://github.com/encode/starlette/issues/919


class SecurityHeadersMiddleware:
    def __init__(
        self,
        app: ASGIApp,
    ) -> None:
        self.app = app

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        async def send_wrapper(message: Message) -> None:
            if message["type"] == "http.response.start":
                headers = MutableHeaders(scope=message)
                headers[
                    "Strict-Transport-Security"
                ] = "max-age=63072000; includeSubDomains; preload"
            await send(message)

        await self.app(scope, receive, send_wrapper)

A related issue is that BackgroundTasks will get cancelled if you have a middleware and the caller sets the header Connection: close.

# Python 3.8.13
import asyncio
import fastapi  # fastapi==0.75.2
import time
import uvicorn  # uvicorn==0.17.6

import logging

logging.basicConfig(format='%(asctime)s.%(msecs)03dZ [%(levelname)s] %(message)s')
log = logging.getLogger()

app = fastapi.FastAPI()


@app.middleware("http")
async def empty_middleware(request: fastapi.Request, call_next):
    response = await call_next(request)
    return response


@app.get("/")
async def the_route(background_tasks: fastapi.BackgroundTasks):
    background_tasks.add_task(a_little_work)
    # background_tasks.add_task(shielded_work) <-- this one completes

    return {'status': 'ok'}


async def a_little_work(prefix='unshielded'):
    start = time.monotonic()
    log.warning(f"{prefix} background task starts!")
    await asyncio.sleep(0.5)
    end = time.monotonic()
    # this statement never gets printed if the Connection header is set to close:
    log.warning(f"{prefix} background task done, took {end-start}s")


async def shielded_work():
    import asyncio
    await asyncio.shield(a_little_work('shielded'))


if __name__ == '__main__':
    uvicorn.run(app, host="0.0.0.0", port=8000)
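The asyncio.shield behaviour used in shielded_work above can be demonstrated in isolation: cancelling the shielding future does not cancel the inner task, which still runs to completion:

```python
import asyncio

results = []

async def work():
    await asyncio.sleep(0.01)
    results.append("work finished")

async def main():
    inner = asyncio.ensure_future(work())
    outer = asyncio.shield(inner)
    await asyncio.sleep(0)   # let the inner task start
    outer.cancel()           # cancels only the shield wrapper
    try:
        await outer
    except asyncio.CancelledError:
        pass
    await inner              # the shielded task completes anyway

asyncio.run(main())
```

This is why the shielded variant of the background task survives the middleware-driven cancellation while the unshielded one does not.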

I discovered the same issue when trying to upgrade my app from fastapi 0.68 to 0.70. Interestingly, background tasks work fine on my workstation (Ubuntu 20.04) but fail on a server with CentOS 7 (same code, Python 3.9, and same module versions).

It looks like the best solution is to deprecate BaseHTTPMiddleware, as mentioned in #1678

I propose that this issue be closed - as far as I can tell, the issues reported here are stale, invalid, or duplicates of other issues:

  • The original issue reported by @retnikt seems to be an issue regarding the way that background tasks interfere with the processing of new requests.
    • Despite my best efforts, I have been unable to reproduce the issue, even after installing starlette==0.13.2+uvicorn==0.11.4;
    • In Starlette 0.15.0, the adoption of anyio has completely changed the behavior of BaseHTTPMiddleware regarding that kind of concurrency and cancellation issues, meaning that I suspect that the issue is stale;
  • @erewok's issue from July 2020, like the previous report, may also be stale, as it dates from before Starlette 0.15.0 (released in June 2021);
  • @jacopofar's issue with GZipMiddleware cannot have anything to do with BaseHTTPMiddleware as GZipMiddleware is not a subclass of BaseHTTPMiddleware - maybe there is another issue being reported there?
  • @zoola969's issue looks to me like a cancellation issue, similar to #1438;
  • @maksimt's issue is clearly a duplicate of #1438;

Here is how I think we should proceed on this issue:

  • If you have issues around cancellation when a BaseHTTPMiddleware is in use, then you may want to check #1438 (and my proposed fix in #1715);
  • If you have other issues with current versions of Starlette, please open new issues;

The GZipMiddleware issue was related to BaseHTTPMiddleware, but it has already been fixed. 🙏

The other, as you said. 🙌