eclipse-zenoh/zenoh-python

Asyncio support

ipsod opened this issue · 7 comments

ipsod commented

Describe the bug

Hi.

I was trying to figure out how to use Zenoh with asyncio, and the only mention I could find was at the bottom of this page: https://github.com/eclipse-zenoh/zenoh-python/tree/master/examples

To reproduce

n/a

System info

n/a

appears to have happened in this commit:
2803859

last versions can be found here:
https://github.com/eclipse-zenoh/zenoh-python/tree/8cdb7ee57584a11f43a87f88d5e5e8c86cba111a/examples

oddly, the base rust library is wholly async, but the python bindings are sync. we appear to have had async some time in the past, where that text is leftover from. however, I can't find any of the async calls from those old examples grepping the code, so it seems likely those APIs are gone.

kind of a mood killer for the Next Big Thing in Internet Computing..

Hi,

Since the major API reworks of Zenoh 0.6, we had decided to drop support for the async API in Python. There are a few reasons for this, which kind of amplify each other:

  • Maintaining both a synchronous and asynchronous API is a serious maintenance burden. Everything has to be done twice, sometimes in largely different fashions due to the very different execution models.
  • Of the core members of the Zenoh team, no one is confident in their understanding of asyncio's event loop, nor of how nicely/horribly it may play with Rust's polling futures.
  • To reduce the minimum overhead, Zenoh calls callbacks from its own thread pool. Python's GIL is already a big burden on this execution model, as GIL pre-emption doesn't seem to apply with pyo3. This turns long Python callbacks into a bottleneck for the whole application, and I'm kinda worried (but admittedly don't have proof) that coroutines would make things worse.

If you're working with asyncio, I'd suggest piping your samples through some sort of asynchronous channel, and having a task read the samples from that channel asynchronously. Doing so will have the added bonus of guaranteeing that your callbacks stay short, which will let Zenoh perform to its best.

As a side note, I'm retagging this issue:

  • Remove bug: the actual bug was in documentation, by still mentioning asyncio which we have intentionally dropped support for.
  • Add enhancement: Provided we find sufficient bandwidth within the team to properly study it, we'll probably come back to how to support asyncio to a standard that warrants the additional maintenance burden (since our justifications for removing it was "we're doing much more work for something we're not convinced we're doing properly").
  • Add wontfix: The team is busy with many new features for the whole Zenoh ecosystem, so I wouldn't expect that asyncio support to come soon.
ipsod commented

Regarding the dangers of a Python script failing to cleanup a Zenoh subscriber...

I sometimes write code so bad that it cannot be exited normally, and the process must be killed. What should I do to clean up after such an event?

This would be better suited for issue #62, but I'll reply in context: should your process exit abnormally, the link(s) to neighbour(s) will eventually time out (10s by default), at which point all declarations made by the dead session will be undeclared.

Note that this will only happen once the session has become unable to contact its neighbours. If you may declarations in a process that does keep its session alive and well, the infrastructure will assume the session still needs all of the declarations that haven't been undeclared.