python/asyncio

loop.subprocess_shell() and loop.subprocess_exec() may block

Martiusweb opened this issue · 4 comments

Hi,

Subprocess creation is handled by loop.subprocess_exec() and loop.subprocess_shell(). Both use subprocess.Popen(), but the latter performs at least one blocking read on a pipe (on Unix at least), which is used to detect the outcome of exec() in the new child process (see: https://github.com/python/cpython/blob/master/Lib/subprocess.py#L1538).

This causes the loop to block if the fork/exec takes time in the subprocess, for instance because the system is under high IO pressure, or because preexec_fn performs a slow operation.
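A minimal sketch of how the stall can be observed, assuming Unix and Python 3.7+. The names `slow_preexec` and `ticker` are hypothetical, and the 0.5 s sleep merely stands in for a slow preexec_fn or a slow fork/exec. A background task records how long each 10 ms sleep actually takes; while the subprocess is being created, one of those sleeps should stretch to roughly the duration of preexec_fn.

```python
import asyncio
import sys
import time


def slow_preexec():
    # Stand-in for a slow preexec_fn; runs in the child before exec().
    time.sleep(0.5)


async def ticker(gaps):
    # Record how long each nominal 10 ms sleep really took.
    while True:
        start = time.monotonic()
        await asyncio.sleep(0.01)
        gaps.append(time.monotonic() - start)


async def main():
    gaps = []
    task = asyncio.create_task(ticker(gaps))
    await asyncio.sleep(0.05)  # let the ticker run a few times first
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "pass", preexec_fn=slow_preexec
    )
    await proc.wait()
    task.cancel()
    # Longest observed gap between two ticker wake-ups, in seconds.
    return max(gaps)


if __name__ == "__main__":
    print("longest loop stall: %.3f s" % asyncio.run(main()))
```

If the loop were not blocked, the longest gap would stay close to 10 ms.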

I am not sure how this issue should be fixed.
There is a quick-and-dirty fix: invoke subprocess.Popen() in an executor thread, but we probably want to address the original issue (the blocking read).
In the latter case, can we update subprocess.Popen(), or should we try to subclass it in asyncio without modifying CPython?
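For reference, the quick-and-dirty approach could look roughly like the sketch below. It assumes Python 3.7+; `popen_in_executor` is a hypothetical helper, not an asyncio API. It returns a plain subprocess.Popen object, so it bypasses asyncio's subprocess transports and protocols entirely.

```python
import asyncio
import subprocess
import sys


async def popen_in_executor(args, **kwargs):
    """Hypothetical helper: run subprocess.Popen() in the default executor.

    The fork/exec and the blocking read on the error pipe then happen in
    a worker thread instead of the event loop thread.
    """
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(
        None, lambda: subprocess.Popen(args, **kwargs)
    )


async def main():
    proc = await popen_in_executor(
        [sys.executable, "-c", "print('hello')"],
        stdout=subprocess.PIPE,
    )
    # communicate() blocks as well, so it is also pushed to the executor.
    loop = asyncio.get_running_loop()
    out, _ = await loop.run_in_executor(None, proc.communicate)
    return proc.returncode, out


if __name__ == "__main__":
    print(asyncio.run(main()))
```

This only moves the blocking read to another thread. It does not remove it, which is why fixing the read itself looks like the better long-term option.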

I can try to work on a patch, but suggestions about where to start are more than welcome (especially regarding my previous question!).

Definitely better to fix Popen by, say, adding a flag to disable the read.

Honestly, I doubt that anybody passes preexec_fn to asyncio subprocess calls.
Under high IO pressure the whole OS is slow and the event loop is most likely affected as well, isn't it?

I use preexec_fn to set resource limits (rlimit) and to set the uid/gid/supplementary groups of the new process.
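A simplified sketch of that kind of usage, assuming Unix. `limit_child` is a hypothetical name and the limit of 64 file descriptors is an arbitrary illustrative value. Changing uid/gid requires privileges, so that part is only indicated in a comment.

```python
import asyncio
import resource
import sys


def limit_child():
    # Runs in the child between fork() and exec(); Unix only.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Lower the soft limit to 64 (arbitrary), without exceeding the hard limit.
    if hard == resource.RLIM_INFINITY or hard > 64:
        new_soft = 64
    else:
        new_soft = hard
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    # A privileged parent could also drop privileges here with
    # os.setgroups(), os.setgid() and os.setuid(), in that order.


async def main():
    # The child reports the soft RLIMIT_NOFILE it ended up with.
    code = (
        "import resource;"
        "print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
    )
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", code,
        stdout=asyncio.subprocess.PIPE,
        preexec_fn=limit_child,
    )
    out, _ = await proc.communicate()
    return int(out)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Everything done in `limit_child` delays the exec(), and therefore extends the time the parent spends in the blocking read.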

When the system is under IO pressure because of slow disk operations, if the loop doesn't block, I can mitigate the pressure (by telling my clients to wait and retry later for instance). Otherwise, when the process is ready to work again, I have to face a thundering herd of pending requests, which makes things worse. Also, high IO on one device doesn't mean it affects the whole system (for instance, the system disk may be slow, but the data disk may be fine).

Anyway, I feel like this should be fixed, even if this is an uncommon issue.

Ok, but most likely it affects Popen also. Overriding the whole _execute_child scares me.

1st1 commented

People do use `preexec_fn`, and yes, it blocks. I have the same bug in uvloop, although it doesn't use Python's subprocesses. Once I fix that, you can use uvloop to avoid the problem.

And yes, we should fix asyncio somehow.