django/asgiref

Using asgiref.local.Local with gevent leads to memory leaks

reupen opened this issue · 7 comments

We use Django with Gunicorn and gevent workers. Since updating to Django 3, we started seeing some reasonably severe memory leaks in our web processes.

I bisected one of the leaks I could reproduce locally to django/django@a415ce7.

In particular, it seems to be caused by the switch from threading.local to asgiref.local.Local.

I investigated a fair bit. There seem to be a few things going on:

(Also, I'm not sure how gevent-friendly CLEANUP_INTERVAL = 60 is.)

It could be argued that gevent-related bugs are a problem with gevent. However, I think this problem is quite severe and also not easily noticed as it's mostly a silent problem.

While any fixes (here or in gevent) would be appreciated, it would be good to document the incompatibility too.

(I tested gevent 1.4.0 and 1.5a3.)

Thanks

Yeah, gevent does a lot of weird things that I don't know we'll be able to plug entirely.

Are you letting gevent monkeypatch the threading module? If you disable that, it might help with all the checks that are failing. If that is it, I don't see an issue sticking an explicit warning in there if that's what happens; not sure I want to teach Local about greenlets, though...

Are you letting gevent monkeypatch the threading module?

Yes, Gunicorn does that when using gevent workers: https://github.com/benoitc/gunicorn/blob/7d8c92f48a6d9cee6b15fbdade6981721182d073/gunicorn/workers/ggevent.py#L38

That'll be the problem then. Hard to reason about threads when they're... not threads.

I'll need to think about the best approach to this one. I really don't want to have asgiref depend on gevent at all, but detecting that threads aren't threads is going to be tricky otherwise.

@jamadden – any input or thoughts about this?

OK, this might well be solved by the new patch for Local that removes directly inspecting the class type of threads (https://github.com/django/asgiref/tree/new_local)

If someone wants to try installing the new_local branch and seeing how it fares, that would be great.

I've given it a quick test locally – it does indeed seem to have fixed the problem. Thanks!

Alright, I'll get it merged in then, as it seems to solve at least two problems and not cause any issues!