tc39/proposal-top-level-await

Improving micro-task turn consistency.

kmiller68 opened this issue · 4 comments

While implementing top-level await in JSC, I noticed some oddness around when micro-tasks run during async module loading: semi-independent subgraphs of modules may interleave their micro-tasks. For example:

// a.js
await ...;
Promise.resolve().then(() => print("A"));

// b.js
import "./a.js";
print("b");
Promise.resolve().then(() => print("B"));

// c.js
import "./a.js";
print("c");
Promise.resolve().then(() => print("C"));

// d.js
import "./b.js";
import "./c.js";

will output "AbcBC" when importing d.js, whereas I would have expected the output to be "AbBcC". In this graph b.js and c.js don't have any dependency relation, but c.js's body and micro-tasks are separated by b.js's micro-tasks. This really only matters if c.js's body and micro-tasks rely on some global state persisting across them, which is likely to happen in practice.
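For reference, the current interleaving can be reproduced with plain promises. This is a hand-flattened simulation of the graph above, not the spec algorithm; the structure of the `then` chain is just my reading of the current semantics:

```javascript
// Hand-flattened simulation of the d.js graph above under the current
// semantics (illustrative, not spec text):
async function simulateCurrent() {
  const log = [];
  const print = (s) => log.push(s);

  // a.js suspends at its top-level await...
  await Promise.resolve();
  // ...then resumes and schedules its microtask:
  Promise.resolve().then(() => print("A"));

  // The sync parent subgraph (b.js, c.js, d.js) resumes in a single
  // promise reaction on a.js's completion, which runs after "A":
  await Promise.resolve().then(() => {
    print("b"); Promise.resolve().then(() => print("B")); // b.js
    print("c"); Promise.resolve().then(() => print("C")); // c.js
  });
  return log;
}

simulateCurrent().then((log) => console.log(log.join(""))); // "AbcBC"
```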

The place where I expect the current micro-task queuing to be a pain point for developers is when a new dependency gets added and a failure appears in an unrelated module that is neither a dependency nor a dependent of the new one. Using the previous example, if b.js were instead newly added to the graph, it could cause a new failure in c.js's micro-task, as b.js's micro-task is now injected between c.js's body and micro-task.

The current behavior arose from misunderstandings/miscommunications in a WhatWG issue, which I believe have since been resolved. One way I think we could improve the situation is to drain micro-tasks after each yield/return from an async module body. Note, I'm not proposing we change the sync subgraph behavior (it's probably not web-compatible anyway). Aside: pre-TLA module evaluation was roughly equivalent to doing a post-order traversal of the graph and running the module bodies in sequence (with different scopes and whatnot). I think this also has the upside of potentially simplifying the spec.
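The drain-after-each-body idea can be sketched by hand-flattening the example graph into a sequence of awaits (again illustrative only, not spec text):

```javascript
// Hand-flattened simulation of the same graph with a microtask drain
// after each module body (illustrative, not spec text):
async function simulateProposed() {
  const log = [];
  const print = (s) => log.push(s);

  // a.js
  await Promise.resolve();
  Promise.resolve().then(() => print("A"));
  await 0; // drain: "A" runs before the next module body starts

  // b.js
  print("b");
  Promise.resolve().then(() => print("B"));
  await 0; // drain: "B" runs here

  // c.js
  print("c");
  Promise.resolve().then(() => print("C"));
  await 0; // drain: "C" runs here
  return log;
}

simulateProposed().then((log) => console.log(log.join(""))); // "AbBcC"
```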

With this change (and assuming we take #158), I imagine the module evaluation part of the spec looking something like this (in JS pseudocode):

// Module evaluation starts here with TLA by calling this with the root module record. 
// If the promise returned here is rejected that means something in the graph failed. 
// The error in the rejected promise should come from the failing module.
async function asyncEvaluate(module) {
    if (module.[[Evaluated]])
        return;
    module.[[Evaluated]] = true;    

    if (!module.[[Async]] && !IsTransitiveDependencyAsync(module))
        return evaluate(module);

    let dependencies = []; // needed for non-blocking evaluation.
    for (let m of GetModuleRecords(module.[[RequestedModules]])) {
        let dependency = asyncEvaluate(m);
        if (dependency)
            dependencies[dependencies.length] = dependency;
        await 0; // turn the micro-task queue to drain tasks before starting the next dependency.
    }

    await InternalPromiseAll(dependencies); // Same as Promise.all but without the for-of because that's observable.

    var awaitedValue;
    while (IsStillAsyncEvaluating(module))
        awaitedValue = await EvaluateOrResumeAsyncModuleBody(module, awaitedValue);
}

// This is equivalent to the pre-TLA module body evaluation.
function evaluate(module) {
    if (module.[[Evaluated]])
        return;
    module.[[Evaluated]] = true;    

    ASSERT(!module.[[Async]]);

    for (let m of GetModuleRecords(module.[[RequestedModules]]))
        evaluate(m);

    EvaluateModuleBody(module); // throws if module body execution throws
}
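The `evaluate` function above is just the pre-TLA post-order traversal from the aside. As a runnable toy, with internal slots replaced by plain properties and module records reduced to a minimal illustrative shape:

```javascript
// Runnable toy version of `evaluate`: internal slots become plain
// properties and a module record is just { body, deps, evaluated }.
function evaluateToy(mod) {
  if (mod.evaluated) return;
  mod.evaluated = true;
  for (const dep of mod.deps) evaluateToy(dep); // dependencies first
  mod.body();                                   // then this module's body
}

const order = [];
const make = (name, deps = []) =>
  ({ body: () => order.push(name), deps, evaluated: false });
const a = make("a");
const b = make("b", [a]);
const c = make("c", [a]);
const d = make("d", [b, c]);
evaluateToy(d);
console.log(order.join("")); // "abcd": post-order, shared dep "a" runs once
```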

Hopefully, this code makes some sense. There are obvious optimizations that could reduce the spec size when this isn't written in isolation, but I tried to keep it simple. What are people's thoughts?

I'm thinking about bringing this up at the meeting in a few days since I've heard at least one implementation plans on shipping soon.

I am out the next couple of days (Germany/Mozilla has holidays Fri-Mon) but I will review this before Tuesday.

In the meantime (I only read this briefly), would it be possible to see a "diff" from the existing behavior? It is simplified enough that I might miss the actual changes. Right now it looks like the change is the extra turn of the microtask queue?

Note, I'm not proposing we change the sync subgraph behavior (it's probably not web-compatible anyway)

As far as I can tell, the proposed semantic here would be a change of the sync subgraph behaviour, since d, c, and b would no longer execute sync together if they are running microtasks between them, by definition of what it means to be sync.

Using the previous example, if b.js were instead newly added to the graph, it could cause a new failure in c.js's micro-task as b.js's micro-task is now injected between c.js's body and micro-task.

It's worth noting here that if "a.js" in the example is sync and not async, this same issue applies to modules in JS today for the same example without TLA. So my question would be what aspect of TLA do you feel makes this situation different for the modules b, c and d here?

One of the design goals was to ensure that sync parent subgraph remains sync, which is what is happening here. If there is no TLA and a is just sync, the output would be "bcABC". Changing "a" to async gives the "AbcBC" ordering under TLA you mention. This retains the invariant that the sync parent subgraph ordering of DBC remains the same under TLA as before TLA, including the ordering of their microtasks. Only async modules in the graph get new exec orderings. The idea being that changing a deep dependency from sync to async should not otherwise change parent execution ordering in comparison to what is implemented today. If B or C had an await though, then there would be task queue clearing as they would be treated as async modules at that point.
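The no-TLA baseline mentioned here can be checked the same way: with a sync a.js, all the bodies run in one synchronous post-order pass and the three microtasks drain afterwards (hand-flattened and illustrative, not the spec algorithm):

```javascript
// Hand-flattened simulation of the example graph when a.js is sync:
function simulateSyncA() {
  const log = [];
  const print = (s) => log.push(s);
  Promise.resolve().then(() => print("A"));             // a.js (now sync)
  print("b"); Promise.resolve().then(() => print("B")); // b.js
  print("c"); Promise.resolve().then(() => print("C")); // c.js
  // d.js body is empty; the microtasks drain after the whole graph:
  return Promise.resolve().then(() => log);
}

simulateSyncA().then((log) => console.log(log.join(""))); // "bcABC"
```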

Let me know if I'm missing anything, and thank you for looking into these semantics questions.

Note, I'm not proposing we change the sync subgraph behavior (it's probably not web-compatible anyway)

As far as I can tell, the proposed semantic here would be a change of the sync subgraph behaviour, since d, c, and b would no longer execute sync together if they are running microtasks between them, by definition of what it means to be sync.

Perhaps there is a terminology confusion. What I meant by sync subgraph is a subgraph where a module and all transitive dependencies of that module do not contain a TLA. I was thinking more of an extension of the typical definition of a subtree extended to graphs (i.e. pick a node, make that the root, and remove anything not reachable). For what it's worth, d, c, and b are not running sync together in the current proposal since there is a microtask turn between c and d. Although that may be different with #159, the wording in the spec makes it hard to track ("sorted in the order they had their [[AsyncEvaluating]] fields set to true in InnerModuleEvaluation" makes tracking that an exercise for the reader).

Somewhat unrelated, but is there a reason why the spec is written as a loop rather than recursion? I have a suspicion the spec could be a fair bit simpler and easier to follow that way. Although, perhaps the non-blocking part makes recursion difficult?

Using the previous example, if b.js were instead newly added to the graph, it could cause a new failure in c.js's micro-task as b.js's micro-task is now injected between c.js's body and micro-task.

It's worth noting here that if "a.js" in the example is sync and not async, this same issue applies to modules in JS today for the same example without TLA. So my question would be what aspect of TLA do you feel makes this situation different for the modules b, c and d here?

One of the design goals was to ensure that sync parent subgraph remains sync, which is what is happening here. If there is no TLA and a is just sync, the output would be "bcABC". Changing "a" to async gives the "AbcBC" ordering under TLA you mention. This retains the invariant that the sync parent subgraph ordering of DBC remains the same under TLA as before TLA, including the ordering of their microtasks. Only async modules in the graph get new exec orderings. The idea being that changing a deep dependency from sync to async should not otherwise change parent execution ordering in comparison to what is implemented today. If B or C had an await though, then there would be task queue clearing as they would be treated as async modules at that point.

I see, this was something that I missed. Perhaps it might be worth extending the ordering section of the readme to go a bit more into that. That section currently seems to be more about what I was referring to as a sync subgraph before.

One downside of this is it does create a bit of a refactoring hazard when a dependency is already async. By changing a module to be async it now inserts a microtask turn where one didn't exist before. That said, that same hazard happens when introducing async for the first time in my proposal as well.

Overall, assuming we're correct that #159 does what you were saying, I think that's a reasonable design.

Perhaps there is a terminology confusion.

There are a few terminology gaps we could certainly fill a bit better here agreed.

What I meant by sync subgraph is a subgraph where a module and all transitive dependencies of that module do not contain a TLA.

Yes, we also extend the meaning of [[AsyncEvaluating]] to have this definition of asynchrony as well, yet we still consider "sync subgraph" in discussion to apply to any isolated subgraph containing modules whose bodies are sync, regardless of dependencies.

For what it's worth d, c, and b are not running sync together in the current proposal since there is a microtask turn between c and d

There isn't supposed to be a microtask turn between c and d - if you add a print("d") the output should be "AbcdBC".
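A hand-flattened check of that ordering, using the same illustrative promise model as the original example (not the spec algorithm):

```javascript
// Same graph as before, with a print("d") added to d.js:
async function simulateWithD() {
  const log = [];
  const print = (s) => log.push(s);
  await Promise.resolve();                  // a.js suspends, then resumes
  Promise.resolve().then(() => print("A"));
  await Promise.resolve().then(() => {      // b, c, d resume together
    print("b"); Promise.resolve().then(() => print("B")); // b.js
    print("c"); Promise.resolve().then(() => print("C")); // c.js
    print("d");                             // the added print in d.js
  });
  return log;
}

simulateWithD().then((log) => console.log(log.join(""))); // "AbcdBC"
```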

#159 is separate from this work and concerns an ordering difference between the current spec and the TLA spec (by the same principle that adding async modules to the graph should retain the ordering invariants we already have). I believe that issue applies equally to the algorithm you outlined and the current algorithm, with the example case as per #158 (comment).

but the wording in the spec makes it hard to track ("sorted in the order they had their [[AsyncEvaluating]] fields set to true in InnerModuleEvaluation" makes tracking that an exercise to reader).

We originally used a global counter here, but it felt odd to have a sense of globalish state. Open to better suggestions here.
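A hypothetical sketch of what the counter approach could look like (nothing here is spec text; `asyncEvaluationIndex` and the record shape are made up for illustration):

```javascript
// Hypothetical alternative to "sorted in the order [[AsyncEvaluating]]
// was set": record an explicit index when the flag is set, then sort on it.
let nextIndex = 0; // the "globalish" counter in question
function markAsyncEvaluating(record) {
  record.asyncEvaluating = true;
  record.asyncEvaluationIndex = nextIndex++;
}

const records = [{ name: "x" }, { name: "y" }, { name: "z" }];
for (const r of records) markAsyncEvaluating(r);
// Later, execution order is just a sort on the recorded index:
const ordered = [...records]
  .sort((p, q) => p.asyncEvaluationIndex - q.asyncEvaluationIndex)
  .map((r) => r.name);
console.log(ordered.join("")); // "xyz"
```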

Although, perhaps the non-blocking part makes recursion difficult?

The algorithm is recursive as far as I'm aware. Can you clarify - are you referring to the loop implemented in #158?

By changing a module to be async it now inserts a microtask turn where one didn't exist before. That said, that same hazard happens when introducing async for the first time in my proposal as well.

Yes, if anything I would have liked to avoid that tick as well instead of introducing it, but I'm not sure what factors might be in play there.

Thanks again for the detailed feedback.