async-interop/event-loop

Scoping

kelunik opened this issue · 8 comments

We have quite a few PRs and the discussion is spread accross those PRs instead of being in one issue. I know, adding yet another discussion thread seems kind of pointless then, but I think it's important to gather some clear statements about expectations regarding scoping.

PRs

My own statement

I think a clear scope avoids bugs with different drivers. We allow passing a driver to be run to Loop::execute and should therefore only rely on the driver being accessible with the Loop accessor inside that Loop::execute scope.

It has been mentioned that this would prevent the adapter for React having a correct run method and additionally would prevent first setting things up and then running the loop.

The React adapter currently uses Loop::execute(function () {}, Loop::get()); to run the default / current loop. This continues to work with a strict Loop::execute scope (#139) as long as the whole application is wrapped in Loop::execute. It uses a stacked driver then, but as both are the same, everything will work fine.

Another solution for the adapter is to use an explicit driver first and then simply pass it to Loop::execute as React uses an explicit loop that's passed around anyway.

<?php

$loop = new LoopAdapter($interopLoop);
$connector = new DnsConnector($loop);
// ...
$loop->run(); // Executes Loop::execute

Registering events on a specific loop instance without the accessor will always register those events on the right driver (obviously) and is therefore totally fine outside of Loop::execute.

But after all, this is something which doesn't affect any library code. It only affects tests and application code, which both have to be changed anyway to make use of the accessor and interop loop.

I really want to avoid having resetDriver and a default driver at all if we can have a clear scoping and thus avoid bugs with different drivers and forgot-to-reset drivers in tests. It's going to be less confusing for newcomers if the following code just errors instead of just not working, because it's using two different loops.

<?php

Loop::defer(function () { /* never executed */ });

// Uses a new driver and thus "ignores" the previous `Loop::defer()`
Loop::execute(function () { print "hello"; });

I'm very interested in opinions especially from @WyriHaximus @jsor @cboden @davidwdan @mbonneau

I see #149 is also related.

I proposed something similar to #149 way early on when we started, though not quite as spartan. I think Loop::run() looks slightly nicer than Loop::get()->run(), but the latter form certainly reduces the responsibility of the Loop static accessor.

I think there are two major benefits of the current execute() function. The first is that the lifetime of the loop driver is lexically guaranteed:

$myDriver = new MyDriver();

Loop::execute(function() {
    // Loop::get() is guaranteed to return $myDriver (sans a sub-scope).
}, $myDriver);

// $myDriver is guaranteed to be removed from global scope here.

You can no longer guarantee a controlled lifetime for the global event loop with a pure get()/set().

The second benefit is allowing multiple pieces of code to set the global driver without affecting each other. With a plain set(), the order of execution could become important and unrelated code will affect each other.

I would propose a slightly less plain but similar compromise: a stack-based solution similar to a Lisp special variable. This has the second benefit above that execute() also has:

// Create some specific driver.
$myDriver = new MyDriver();

// Make the driver the active driver.
// Could also call this Loop::push().
Loop::init($myDriver);

// Do some async...
Loop::get()->defer(function() {
    // We can also call other functions which run their own loop.
    doSomethingAndWait();
});

function doSomethingAndWait()
{
    // We can set the active driver without destroying the previous driver.
    Loop::init(new OtherDriver());

    // Do some stuff while OtherDriver is active...

    // Run the loop.
    Loop::get()->run();

    // Clean up after ourselves.
    Loop::drop();
}

// Run the loop.
Loop::get()->run();

// And finally clean up.
// Could also call this Loop::pop().
Loop::drop();

This is a pattern I see in a few libraries that try to reduce the pain of global variables, where every init call should be paired with another cleanup call to manually control the scope of the global value. (Of course should isn't something we can guarantee, but that is the tradeoff we're discussing.)

Note that this issue is redundant if we go with #149. (As this then no longer is this specifications problem to solve.)

If we'd go with #149, it'd be the task of libraries to provide friendly access.
E.g. \Amp\defer($cb) could call Loop::get()->defer($cb) then. And we'd have \Amp\run() which provides proper scoping for you.

Also, your Loop::init()/Loop::drop() is really just sugar for $old = Loop::get(); Loop::set(new OtherDriver); ... Loop::set($old);.

IFF we go with a radical cleanup, I'd go with the simplest common denominator of interop, just a simple getter and setter.

fyi @sagebind back then when you proposed it, we were still retaining the loop accessor functions (e.g. Loop::defer() etc.) … Initially at least I intended the Loop class to be directly consumed by users; and not just libraries at the very lowest level.

@bwoebi It's mostly sugar, but the key is that Loop::init() would not allow breaking the driver stack. It provides at least some guarantee that code cannot set the global driver without preserving the previous driver.

Forgetting to drop() at the end breaks it again (that's actually why Loop::execute() expects a closure so that the scope is well delimited).

At the point where we have just get+set, there will be only a few well-defined functions for each low-level library using Loop::set(). Thus it won't be even worth using init()/drop().

@sagebind described very nicely the benefits of execute(); I really think it would be a shame to move away from that approach. Has there been any discussion happening elsewhere about this issue, as it doesn't seem like this has really moved on in the past three weeks?

I believe we should go with the scoped version using execute(). This is the better way to do things. We shouldn't be burdened by prior practices, especially since it is trivial to wrap code previously written for an un-scoped loop within a callback for a scoped loop.

I'm in favor of retaining a scoped execute() as well. The benefits outweigh the disadvantages we might be concerned about.