python/cpython

deprecate the asyncio child watchers system

graingert opened this issue · 33 comments

Deprecate the child watchers system and the policy system in favour of
asyncio.Runner(loop_factory=asyncio.ProactorEventLoop/asyncio.SelectorEventLoop/uvloop.new_event_loop)
that would be deprecating:

asyncio.get_event_loop() # already deprecated unless the loop is running 
asyncio.set_event_loop()  # asyncio.set_event_loop(None) should probably be exempt
asyncio.get_event_loop_policy()
asyncio.set_event_loop_policy()  # asyncio.set_event_loop_policy(None) should probably be exempt
asyncio.set_child_watcher()
asyncio.get_child_watcher()  # `_make_subprocess_transport` will instead attempt to use os.pidfd_open and fall back to starting a thread

I'd also like to introduce a new API: asyncio.EventLoop implemented as:

if sys.platform == "win32":
    EventLoop = ProactorEventLoop
else:
    EventLoop = SelectorEventLoop

asyncio.new_event_loop() will issue a DeprecationWarning if the current policy is not the default policy, and then in 3 releases become an alias of asyncio.EventLoop

Originally posted by @graingert in #93896 (comment)

Some background to this proposed deprecation:

This is a more comprehensive version of #82772

@serhiy-storchaka proposed deprecating set_event_loop() in #93453 (comment)

But maybe we should first deprecate set_event_loop()? It will be a no-op now.

@asvetlov noted that
get_event_loop_policy().get_event_loop() was not deprecated by oversight in #83710 (comment)

IMHO, asyncio.set_event_loop() and policy.get_event_loop()/policy.set_event_loop() are not deprecated by oversight.

We are in dire need of more asyncio experts. @1st1 this isn't urgent but would be nice to have your perspective in time to do this in 3.12.

For 3.12 IMO we should deprecate MultiLoopWatcher #82504 and the others which have race conditions and other issues. Once that's done we may deprecate the entire child watcher system, but just removing MultiLoopWatcher would be a good start.

We discussed this at the sprint and we agree that there are many things wrong with the child watchers and the policy system.

Deprecating child watchers: @1st1 thinks these should be done per loop (like uvloop does), not globally by the policy. Much more discussion on this topic is already in #82772. Bottom line, we agree to deprecate it, details remain to be seen.

Deprecating policies: Yes please. The policies no longer serve a real purpose. Loops are always per thread, there is no need to have a "current loop" when no loop is currently running. The only thing we still need is a loop factory, so perhaps instead of an API for getting/setting a global "policy", we could have an API for getting/setting a global "loop factory".

I'm fine with the EventLoop alias (it ties up a loose end), but I recommend that the API for creating a new event loop (when not using runners) should be asyncio.new_event_loop(), not asyncio.EventLoop().

We should totally deprecate set_event_loop() (even with None argument). At that point we can make get_event_loop() an alias for get_running_loop() (or the other way around -- I prefer calling get_event_loop() :-).

so perhaps instead of an API for getting/setting a global "policy", we could have an API for getting/setting a global "loop factory".

I disagree; that's the most painful part of the policy system, which I'm looking to deprecate here in favor of passing an explicit loop_factory to asyncio.Runner. The behavior of Runner should be to pick the best event loop by default. If people need to change the behavior of Runner they should pass an explicit factory, and if they really need to patch this behavior globally for the whole process they can use a monkey patch.

that's the most painful part of the policy system

What exactly is most painful? That there's a global default for something? To me the painful thing is that the policy system is over-engineered: you have to create a class that overrides new_event_loop, instantiate it, and call set_event_loop_policy with the instance. That is just classic Java. You shouldn't have to create a class, just a function.
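
To make the contrast concrete, here is a rough sketch comparing the two styles. It assumes a Python version where the policy API still exists (newer versions emit a DeprecationWarning for it, which the sketch silences); make_loop, loop_via_policy and FactoryPolicy are hypothetical names used only for this example.

```python
import asyncio
import warnings


def make_loop():
    # Function style: a plain callable is all that is needed.
    return asyncio.SelectorEventLoop()


def loop_via_policy():
    # Class style: subclass a policy, instantiate it, install it globally.
    with warnings.catch_warnings():
        # Newer Python versions warn that the policy API is deprecated.
        warnings.simplefilter("ignore", DeprecationWarning)

        class FactoryPolicy(asyncio.DefaultEventLoopPolicy):
            def new_event_loop(self):
                return make_loop()

        asyncio.set_event_loop_policy(FactoryPolicy())
        try:
            return asyncio.new_event_loop()
        finally:
            # Restore the default policy so the override does not leak.
            asyncio.set_event_loop_policy(None)


policy_loop = loop_via_policy()
factory_loop = make_loop()
same_type = type(policy_loop) is type(factory_loop)
policy_loop.close()
factory_loop.close()
```

Both paths end up producing the same kind of loop; the policy route just needs a class, an instance, and a process-wide setter to get there.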

I totally agree that most people should use run() or Runner, but I disagree that we should deprecate all other workflows. To me, Runner is just a convenience class.

I discussed this with Yury and he convinced me that we don't need a global loop factory. Instead we should just have a loop_factory=None keyword arg to asyncio.run().

We still have to come up with a way to transition to a world where child watching is per-loop instead of global though.

Good news. @1st1 has a simple refactoring of PidfdChildWatcher that makes it independent from the main loop -- just like ThreadedChildWatcher. Once that is merged (PR is forthcoming) we can also merge @kumaraditya303's PR GH-98024, and then we can start deprecating all other child watcher implementations.

We can then also deprecate set_child_watcher() (both the asyncio function and the policy method) and eventually we can move the child watcher out of the policy. There's some hand-waving here because in theory people could subclass the default policy class and override get_child_watcher() to construct their own child watcher -- we'll have to deprecate that too somehow.

But all this builds a road to a world where policies are no longer needed and eventually no longer exist.

@gvanrossum I already have a PR for PidfdChildWatcher: #94184

It would also be good for the child watchers to be responsible for calling the callback on the event loop thread. Currently the callback needs to defensively call call_soon_threadsafe even when it's redundant, e.g. with PidfdChildWatcher.

E.g.

self.call_soon_threadsafe(self.call_soon, transp._process_exited, returncode)
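
As a sketch of the alternative suggested here, the watcher itself could hop onto the loop thread so the callback never has to. This is not asyncio's implementation; watcher_thread and on_exit are hypothetical names, and the "child exit" is simulated with a hard-coded return code of 0.

```python
import asyncio
import threading


async def main():
    loop = asyncio.get_running_loop()
    done = loop.create_future()

    def on_exit(returncode):
        # Plain callback: no defensive call_soon_threadsafe needed here,
        # because the watcher guarantees it runs on the loop thread.
        done.set_result((returncode, threading.get_ident()))

    def watcher_thread():
        # Hypothetical watcher: pretend a child just exited with code 0,
        # then marshal the callback onto the event loop thread.
        loop.call_soon_threadsafe(on_exit, 0)

    threading.Thread(target=watcher_thread).start()
    returncode, callback_thread = await done
    return returncode, callback_thread == threading.get_ident()


result = asyncio.run(main())
print(result)
```

A watcher that already runs on the loop thread (as a pidfd-based one does via add_reader) could call on_exit directly, which is why the extra hop is redundant there.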

I just read the comments in GH-93453: "Make get_event_loop() an alias of get_running_loop()". This makes me want to go slow with the whole "deprecate policies" part. (I am still fine with deprecating watchers ASAP.)

Maybe we could start by deprecating just set_event_loop_policy(), hence making the policy (eventually) just a global singleton that stores some state (in particular the current thread's loop, even if it's not running)?

Let's deprecate child watchers first and then we can think about policy since it will require more discussion.

Maybe we can also deprecate set_child_watcher now?

I'd like to see set_child_watcher and get_child_watcher deprecated with a private _get_child_watcher added temporarily that returns whatever was set until set/get_child_watcher is removed

But we can't guarantee that the private _get_child_watcher is supported as long as set_event_loop_policy can be called. We could of course just check whether it exists on the policy object and call it only if it exists, otherwise call the public API.

Would it be acceptable if set_child_watcher was a no-op that reported a warning? Then the unix event loop could just do its own event handling and never use the policy to get the watcher. (IIUC uvloop already doesn't use the watcher API.)

Would it be acceptable if set_child_watcher was a no-op that reported a warning?

Probably not, I think people would still expect to get whatever was set during the deprecation period. Maybe the asyncio subprocess API could just ignore it like uvloop does

Probably not, I think people would still expect to get whatever was set during the deprecation period. Maybe the asyncio subprocess API could just ignore it like uvloop does

But then wouldn't they also expect the watcher they set to be used?

Who knows, we need to do some searching. A possible approach would be to deprecate get_child_watcher() and set_child_watcher() but keep their implementation the same, but also stop calling get_child_watcher() in _make_subprocess_transport(). (Arguably if we do this, we should simplify the implementation and always set up a ThreadedChildWatcher in _init_watcher().)
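
A minimal sketch of that approach, using stand-in functions rather than asyncio's real ones: the getter and setter keep working but warn, while the (hypothetical) transport factory no longer consults them. All names and the string values are invented for illustration.

```python
import warnings

_child_watcher = None  # hypothetical module-level state


def set_child_watcher(watcher):
    # Still stores the watcher, but tells callers the API is going away.
    global _child_watcher
    warnings.warn("set_child_watcher() is deprecated",
                  DeprecationWarning, stacklevel=2)
    _child_watcher = watcher


def get_child_watcher():
    # Still returns whatever was set during the deprecation period.
    warnings.warn("get_child_watcher() is deprecated",
                  DeprecationWarning, stacklevel=2)
    return _child_watcher


def make_subprocess_transport():
    # Hypothetical stand-in: never calls get_child_watcher().
    return "built-in child handling"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    set_child_watcher("custom watcher")
    stored = get_child_watcher()
    used = make_subprocess_transport()

warning_count = sum(issubclass(w.category, DeprecationWarning) for w in caught)
```

The open question from the thread remains visible in the sketch: the stored watcher is returned faithfully, yet it is never actually used.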

Searching for set_child_watcher I found gbulb, a library that integrates the GLib main event loop with asyncio. (I actually found glibcoro first, and its README mentions gbulb.)

The importance of this find is that gbulb defines its own child watcher class that integrates with the GLib event loop. They also have a custom policy that manages this watcher, and (of course) a custom event loop. I have a feeling they really need to use their custom watcher in their event loop, because of how GLib works (although I don't know anything about GLib).

I'm guessing in the long run they can refactor their code to avoid using get/set_child_watcher, but the deprecation might be inconvenient for them. (Then again they override the policy methods so they wouldn't get the deprecation warnings.)

This is the first non-trivial mention of set_child_watcher I've found (there are lots of dummies and copies around -- a lot of people somehow emulate asyncio).

There's also an intriguing custom watcher in chaperone but this package appears unmaintained (last commits in 2016). It appears a modified clone of FastChildWatcher.

We need a better plan for this, here's my plan:

  • Make PidfdChildWatcher the default where supported. #98024
  • Deprecate all other child watchers. #98089
  • Deprecate all child watcher configuration methods and functions, e.g. set_child_watcher and get_child_watcher, to warn when used (uvloop does not use any of this). #98215
  • asyncio ignores set child watcher and instead always uses PidfdChildWatcher or ThreadedChildWatcher.
  • Custom event loops need to implement child watching themselves, like uvloop does, rather than relying on the child watchers system and policy.
  • In 3.14 all the config methods will be removed.
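
The "pidfd where supported, otherwise a thread" choice in this plan can be sketched roughly as follows. This is an illustrative, Unix-only sketch and not asyncio's actual code; can_use_pidfd and wait_for_child are hypothetical helpers, and the child is a throwaway interpreter that exits with code 7.

```python
import asyncio
import os
import sys
import threading


def can_use_pidfd():
    # pidfd_open only exists on Linux and may be blocked by a sandbox.
    if not hasattr(os, "pidfd_open"):
        return False
    try:
        os.close(os.pidfd_open(os.getpid()))
    except OSError:
        return False
    return True


async def wait_for_child(pid):
    loop = asyncio.get_running_loop()
    done = loop.create_future()

    def reap():
        _, status = os.waitpid(pid, 0)
        return os.waitstatus_to_exitcode(status)

    if can_use_pidfd():
        pidfd = os.pidfd_open(pid)

        def on_readable():
            # The pidfd becomes readable once the child has exited.
            loop.remove_reader(pidfd)
            os.close(pidfd)
            done.set_result(reap())

        loop.add_reader(pidfd, on_readable)
    else:
        def worker():
            # Block in a helper thread, then report back on the loop thread.
            loop.call_soon_threadsafe(done.set_result, reap())

        threading.Thread(target=worker, daemon=True).start()
    return await done


async def main():
    cmd = [sys.executable, "-c", "raise SystemExit(7)"]
    pid = os.spawnv(os.P_NOWAIT, sys.executable, cmd)
    return await wait_for_child(pid)


exit_code = asyncio.run(main())
print(exit_code)
```

Either branch needs only the running loop, with no global watcher or policy state.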

Are you thinking of doing all of these in 3.12 except the final checkbox? If so we should probably do "asyncio ignores set child watcher and instead always uses PidfdChildWatcher or ThreadedChildWatcher" next, before "Deprecate all child watcher configuration methods and functions ..."

I first want to get agreement on whether we should raise DeprecationWarning when overriding the child watcher or just ignore it. I analyzed this with https://cs.github.com/ and it is mostly used to work around an old asyncio bug where ThreadedChildWatcher wasn't used by default and some other watcher was used instead. This hasn't been an issue since about Python 3.8.

I think the remaining three tasks cannot be done until 3.14.

This is done for 3.12, now onto the policy minefield.

FWIW it seems that Jupyter has a legitimate reason to override the default policy (with another one of the predefined ones), see #93453 (comment).

FWIW it seems that Jupyter has a legitimate reason to override the default policy (with another one of the predefined ones), see #93453 (comment).

The WindowsSelectorEventLoopPolicy shouldn't be needed with tornado 6.2 where it runs a selector in a background thread so the main event loop can be the ProactorEventLoop

We are in dire need of more asyncio experts. @1st1 this isn't urgent but would be nice to have your perspective in time to do this in 3.12.

It seems @1st1 is in favor of deprecating the rest of the policy system: https://twitter.com/1st1/status/1711007413275365590?t=PGAYW5447_2JZdgiIsu6eQ&s=19 Can we do this for 3.13?

I am fine with that. But I am not someone with a lot of time for it. In 3.13 all we need is some deprecations. Can you help with that?

@gvanrossum I can definitely help adding deprecations!

I want to clean up the calls to

def tearDownModule():
    asyncio.set_event_loop_policy(None)

This is now possible for tests that just use asyncio.run, but it's not yet possible for IsolatedAsyncioTestCase. I have a design here: https://discuss.python.org/t/support-setting-the-loop-factory-in-isolatedasynciotestcase/36027/1

There's also an intriguing custom watcher in chaperone but this package appears unmaintained (last commits in 2016). It appears a modified clone of FastChildWatcher.

I attempted to bring Chaperone up to 3.11 last year. The reason it uses a custom watcher, and the reason I was looking at them again, is that anyone wanting to install a SIGCHLD handler under asyncio is going to stumble onto the conflict it produces with the child watchers.

Something designed to run as pid 1 should really be doing zombie reaping for unknown processes, and a custom watcher seems to be the only reliable way to install a default behaviour. Prior to 3.8 the default was SafeChildWatcher, which has a conflict, so a custom one was required. ThreadedChildWatcher from 3.8 and PidfdChildWatcher from 3.9 seem like they won't conflict; however, only the former offers a tracking mechanism, via ThreadedChildWatcher._threads[pid], to allow determining if asyncio is going to call os.waitpid for that process. PidfdChildWatcher has no callback lookup so there's nothing to hijack. Double calls to waitpid are likely to be unpleasant.
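
To illustrate why double calls to waitpid are a problem, here is a small Unix-only sketch: once one party has reaped a child, a second waitpid for the same pid fails. The child command and variable names are made up for the example.

```python
import os
import sys

# Spawn a short-lived child without going through asyncio or subprocess.
pid = os.spawnv(os.P_NOWAIT, sys.executable, [sys.executable, "-c", "pass"])

# First reaper (think: a pid 1 SIGCHLD handler) collects the exit status.
_, status = os.waitpid(pid, 0)
first_exit = os.waitstatus_to_exitcode(status)

# Second reaper (think: a child watcher) arrives too late.
try:
    os.waitpid(pid, 0)
    second_wait = "succeeded"
except ChildProcessError:
    second_wait = "ChildProcessError"

print(first_exit, second_wait)
```

Whichever side loses the race never learns the real exit status, which is why a reaper needs some way to tell which pids asyncio intends to wait on.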

The proper method of dealing with this (admittedly niche) use case still seems to be a custom child watcher or not using asyncio at all. If I'm interpreting the comments correctly, this will now require an entire policy instead of being able to just override the watcher?

This is complete; the deprecation of the policy system can be done in a separate issue.