tc39/proposal-async-context

Specifying `unhandledrejection` behavior

jridgewell opened this issue · 14 comments

Re: nodejs/node#46262

What do we imagine is the correct context when an unhandledrejection is triggered? Given:

const ctx = new AsyncContext();

ctx.run("reg", () => {
  addEventListener('unhandledrejection', () => {
    // ???
    console.log(ctx.getStore());
  });
})

let reject;
ctx.run("init", () => {
  new Promise((_, rej) => {
    reject = rej;
  });
});

ctx.run("call", () => reject('boom'));

Our options include:

  1. reg, which is the context at the time of the unhandledrejection handler's registration
  2. init, which is the context at the time of the promise's creation
  3. call, which is the context at the time of the promise's rejection
  4. undefined, which is the context at the time the unhandledrejection handler is called

Also relevant: the call-time choice does not prevent you from using init-time behavior; you just need to wrap your reject:

let reject;
ctx.run(123, () => {
  new Promise((_, rej) => {
    // wrap captures the current context (123) at creation time.
    reject = AsyncContext.wrap(rej);
  });
});
// The wrapped reject restores 123, so unhandledrejection would observe
// the init-time context (123) rather than the call-time one (321).
ctx.run(321, () => reject('boom'));

From nodejs/node#46262 (comment)

(Bloomberg definitely has an analogous internal case where we currently depend on restoring the AsyncLocalStorage from kInit in unhandledRejection, but this is a code path based on our custom V8 embedding, not Node.js. Probably our usage pattern is not representative of a typical Node app, and maybe we could adopt this AsyncResource pattern; I'm not sure.)

To elaborate slightly here: In our own weird environment, we actually just store the context where the Promise is created, since that's enough in practice for the limited usage pattern. We use a home-grown AsyncContext analogue to track which of the many applications running in the same V8 isolate is the "current" one, as described in this talk.

At Bloomberg, our use case is to track which one of many logical applications is responsible for causing an unhandled promise rejection, and for that the init and call options would give the same answer. We currently do init, but with the unusual property of propagating the context from parent promises.

At all places where we can have async continuations (Promise, setTimeout, etc.) we restore the context (simply, which application is running) that was active when we ran the initial main. We don't (in general) allow applications to call into other applications, so no cross-contamination is possible. The application which created a promise will be the one which rejects it.

Our promise rejection handling is done in C++ and we pull out the application and terminate it.

If we were to allow applications to directly call other applications and pass promises between them, I'm not sure either of the choices given here would be correct. A promise is usually unhandled because someone forgot to handle the possibility of it rejecting. If we imagine application 1 created a promise, had an API to return it to application 2, and then at some point application 1 rejected it, it really is the fault of application 2 for not registering a rejection handler, so application 2 is the one we should penalize and terminate. In all likelihood application 2 would have resolution handlers though, which would create additional promises, meaning init is what we would want (without our parent promise propagation).
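A rough sketch of that scenario, using the proposal's API (all names are illustrative):

const app = new AsyncContext.Variable();

let promiseFromApp1, rejectIt;
app.run("app1", () => {
  // init context: app1
  promiseFromApp1 = new Promise((_, rej) => { rejectIt = rej; });
});

app.run("app2", () => {
  // App 2 receives promiseFromApp1 here but forgets to .catch() it,
  // so the rejection will go unhandled.
});

// call context: also app1
app.run("app1", () => rejectIt(new Error("boom")));
// Both init and call point at app1, yet app2 is the one at fault.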

> In all likelihood application 2 would have resolution handlers though, which would create additional promises, meaning init is what we would want (without our parent promise propagation).

I think this means that you just need the .then() registration time behavior, right? The .then() handler would guarantee that the promise App 2 chains from App 1's promise will use App 2's context.
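A minimal sketch of that, assuming the proposal's registration-time capture (names are illustrative):

const app = new AsyncContext.Variable();

let p;
app.run("app1", () => {
  p = Promise.reject(new Error("boom"));
});
app.run("app2", () => {
  // Registered under app2, so the handler runs with app2's context.
  p.catch(() => console.log(app.get())); // logs "app2"
});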

The interesting case is only where the reject function is leaked (specifically, only the reject seen in new Promise((_, reject) => …)). If that crosses between App 1 and 2, then we have the interesting case.

Yeah, just the .then() registration time behavior. I don't think reject leaking is any different. In our use case it doesn't matter who caused the rejection, what matters is who should have handled it but didn't.

During today's meeting, there were no objections to using the reject's call-time context for unhandledRejection (so, call in the OP).

However, it was brought up that this would diverge from Node's implementation, which uses initialization time (init in the OP). Based on discussions with Node folks, we believe they're OK with changing to call time. But we don't know if they're accepting it because they believe that's how we get the best performance, or because they actually believe that call time is the correct behavior.

It's actually possible to achieve the same performance with both call time and init time. Call time is simpler to specify (and, I think, to implement), but both should be workable. If we believe that init time is the better behavior, we can do that. @legendecas and I will attend a Node working group to discuss.
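For illustration, a rough userland sketch of that equivalence, reusing AsyncContext.wrap from the OP (initTimePromise is a hypothetical helper):

function initTimePromise(executor) {
  return new Promise((resolve, reject) => {
    // Wrapping at construction snapshots the creation context, so an
    // engine that only implements call-time semantics would still
    // surface the init-time context for rejections routed through here.
    executor(resolve, AsyncContext.wrap(reject));
  });
}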

A bit off topic, maybe, as it's not only related to unhandled rejection.
But isn't there already a difference compared to current Node.js behavior, because this proposal picks up the context for promise propagation at .then() time instead of at promise creation?

> But isn't there already a difference compared to current Node.js behavior, because this proposal picks up the context for promise propagation at .then() time instead of at promise creation?

I don't think the difference is observable in userland (except in unhandledRejection) so it's effectively the same.

If we decide to use init time for unhandledRejection, we might switch to capturing during the promise creation instead of during .then().

The difference in the normal case is observable if the promise creation happens at a place where context A is active and then() is called when B is active. I agree that's not that common.
I have seen cases where a promise is created during the async creation of some shared resource at application startup (no/root context active). This promise is later awaited at several places while processing requests where different contexts are active.
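A sketch of that pattern, assuming the proposal's .then()-time capture (names are illustrative):

const requestCtx = new AsyncContext.Variable();

// Shared resource created at startup, before any request context exists.
const ready = new Promise((resolve) => setTimeout(resolve, 10));

function handleRequest(id) {
  return requestCtx.run(id, async () => {
    await ready; // awaited under this request's context
    // .then()-time capture logs `id`; creation-time capture would
    // surface the root (undefined) context instead.
    console.log(requestCtx.get());
  });
}

handleRequest("A");
handleRequest("B");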

> The difference in the normal case is observable if the promise creation happens at a place where context A is active and then() is called when B is active. I agree that's not that common.

const { AsyncLocalStorage } = require('node:async_hooks');

const als = new AsyncLocalStorage();

const p = als.run('A', () => Promise.resolve());
als.run('B', () => {
  const p2 = p.then(() => {
    // Logs B
    console.log(als.getStore());
  });
});

ALS logs B in this case. It seems like it's not the parent promise's captured context, but the child promise's (the one created by .then() in the B context).

Ah ok. So it seems what nodejs/diagnostics#389 describes is actually no longer true.

Yeah, this seems to have been the case going back to at least Node 12.

I don't think either init or call makes sense, given that other callback APIs use registration time:

const variable = new AsyncContext.Variable();
const promise = Promise.resolve();

// All should print REG
variable.run("REG", () => {
  queueMicrotask(() => console.log(variable.get()));
});
variable.run("REG", () => {
  promise.then(() => console.log(variable.get()));
});
variable.run("REG", () => {
  setTimeout(() => console.log(variable.get()), 1000);
});

// Doesn't use registration time, what?
variable.run("REG", () => {
  eventTarget.addEventListener("some-event", () => {
     console.log(variable.get());
  });
});

Like, I see the use for unhandledrejection events specifically, but there are a lot of events (like complete) that, from a user's point of view, should be no different from a .then on a promise. (See also #22)
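For instance (an illustrative browser sketch):

variable.run("REG", () => {
  const req = indexedDB.open("my-db");
  // A one-shot "success" event is morally a .then() on the open
  // request, so users would likely expect "REG" here as well.
  req.addEventListener("success", () => console.log(variable.get()));
});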

What would make more sense is if unhandledrejection events just had a property that has a snapshot caught at the time the promise rejected:

globalThis.addEventListener("unhandledrejection", (event) => {
    event.rejectionSnapshot.run(() => {
        // Find out the state of async variables at reject time
    });
});

The difference I see between unhandledrejection and regular event listeners is that I only have one unhandledrejection listener that's registered at the start of my program. Requiring a listener for each, e.g., HTTP request is a non-starter, because it'll mean my A handler will also be called for rejections that happen during the B request.

const requestCtx = new AsyncContext.Variable();

export default function onFetch(req: Request): Response {
  const id = crypto.randomUUID();
  return requestCtx.run(id, handler, req);
}

function handler() {
  // 2 unhandledrejection handlers will be registered before any
  // rejection events are fired.
  addEventListener('unhandledrejection', ({ reason }) => {
    const id = requestCtx.get();

    // This will error out!
    assert.equal(id, reason);
  });

  Promise.reject(requestCtx.get());
}

// 2 requests fired in parallel
onFetch(new Request('https://example.com/'));
onFetch(new Request('https://example.com/'));

The assert.equal(id, reason) will error out, because 2 unhandledrejection handlers are registered within different registration contexts. The A rejection will call both the A and B handlers, and the B rejection will do the same. And this isn't solvable with removeEventListener (when would you even do that?!), because we need the handler to exist for the entire context lifetime of each request.
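With call-time semantics (or the rejection-snapshot property suggested above), a single top-level listener would suffice. A sketch, reusing requestCtx from the block above:

// Registered once at startup, outside any request context.
addEventListener('unhandledrejection', ({ reason }) => {
  // Under call-time semantics this reads the context active when the
  // promise was rejected, so it matches `reason` for each request.
  console.log(requestCtx.get() === reason); // true for each request
});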

> What would make more sense is if unhandledrejection events just had a property that has a snapshot caught at the time the promise rejected:

@Qard suggested something like this in our recent meeting.

Qard commented

Generally what async_hooks (and therefore also AsyncLocalStorage) has done is to follow the innermost logical path, as that ensures data availability even within internals that logically follow from the point where the data was stored. When following that path, a bind(...) can be done on the store to draw an edge on the graph around that internal behaviour; but if that edge were the default, the data availability wouldn't be there to draw the internal edge manually.

In the case of promises, it would make the most sense for it to follow the point of the resolve/reject call. Currently Node.js actually follows the init, which mostly works fine, but it has the quirk of hiding behaviour internal to the promise executor and behaving strangely if you extract the resolve/reject functions and use them elsewhere. If it's bound to the call location, then the other two points (init or continuation) can be bound around that part of the graph with the bind(...) function of the storage.
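A sketch of that grafting with today's Node.js API, using the static AsyncLocalStorage.bind (which binds a function to the current execution context):

const { AsyncLocalStorage } = require('node:async_hooks');
const als = new AsyncLocalStorage();

let reject;
als.run('init', () => {
  new Promise((_, rej) => {
    // Under call-time semantics the caller of reject would decide the
    // context; binding here grafts the init-time edge back on.
    reject = AsyncLocalStorage.bind(rej);
  });
});

als.run('call', () => reject(new Error('boom')));
// An unhandledRejection observer would now see 'init' rather than 'call'.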

The distinction between init and call location is a bit confusing and ambiguous with raw promises, but with async/await it becomes somewhat clearer: the context branches from the current function into the execution of a promise-returning function, and that graph merges back at the point of resolving. In doing so it allows new storage states to be set during an await and accessible after that await resolves. In some cases it may be desirable to restore the state from before the await, which again can be done with something like bind(...).

The core thinking required to make this maximally useful, I believe, is to approach it from the perspective of tracing the lowest-level graph and providing the tools to conditionally draw edges around things when certain internals are not relevant or should not propagate data through. This behaviour of grafting new edges onto the graph adds a lot of flexibility, but different users can have different opinions of how their store should propagate around things. This is why I had previously suggested having not just a static bind(...) that applies to all stores, but also an instance bind(...) to allow specific stores to do their own manual grafting where necessary.

Side note: I'm planning on eventually proposing some subset of diagnostics_channel to TC39, including a mechanism for using a channel to communicate an optional storage-binding decision through channel.bindStore(...) and channel.runStores(...). Those parts of diagnostics_channel are how Node.js currently aims to support APMs controlling where and when to manipulate their own store contexts. If we can have AsyncContext with an instance-scoped bind, I can wire it up to that and cover any cases APMs need, so long as the default is to follow the innermost logical execution path.
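For reference, a minimal sketch of those two diagnostics_channel pieces as they exist in Node.js today (the channel name and payload shape are made up):

const dc = require('node:diagnostics_channel');
const { AsyncLocalStorage } = require('node:async_hooks');

const channel = dc.channel('my-app:request');
const store = new AsyncLocalStorage();

// An APM opts its store into this channel's binding decisions.
channel.bindStore(store, (data) => ({ requestId: data.id }));

// Instrumented code publishes and runs its continuation under the
// bound stores.
channel.runStores({ id: 1 }, () => {
  console.log(store.getStore()); // { requestId: 1 }
});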