Proposal for reworking promise integration into async_hooks
AndreasMadsen opened this issue · 11 comments
Background
The current implementation of promise integration into async_hooks
is.
- For each new
[[Promise]]
object, call theinit
hook with aPromiseWrap
referencing that Promise. - At
[[resolve]]
or[[reject]]
call theresolve
hook. - For each
[[then]]
callback, call thebefore
hook before calling the callback. - For each
[[then]]
callback, call theafter
hook after calling the callback. - When the Promise object is garbage collected call the destroy hook.
I think this implementation is fundamentally wrong, as it intertwines the promise lifecycle with async_hooks
. This causes a number of issues that are currently blocking us from making async_hooks
stable.
- Performance issues caused by listening for the garbage collection event.
- Thenables are not tracked when used by a native function that creates a microtrask.
destroy
hook is not called ifasync_hook
is enabled after Promise creation.- Tracking the async boundary when multiple
.then()
calls are used on the same promise object, is not possible.
Proposal
My proposal is to rework the promise integration into async_hooks
such that the async barrier is around the [[then]]
call, not creating a new Promise
object.
- at the call of
[[then]]
on apromise
orthenable
create a resource object (or use the promise/thenable object created by[[then]]
) then call theinit
hook. - at the call of the
[[then]]
callback, thebefore
hook is called. - at the end of the
[[then]]
callback, call theafter
hook, immediately after call thedestroy
hook.
How it solves the above-mentioned issues
Performance issues caused by listening for the garbage collection event.
Because the before
and after
hooks are only called once per resource, the destroy
hook can be called immediately after the after
hook. Thus completely eliminating the need to track promise objects in the garbage collector.
Thenables are not tracked when used by a native function that creates a microtrask.
This can now be solved because we don't need to know when the object was created or destroyed. The only knowledge that is required is when the [[then]]
method of the thenable is called by the native JS APIs which invokes the microtask queue. That is actually doable, as we could hook into those APIs. Only manual calls to [[then]]
on a thenable will not be tracked. But I don't see that as a concern, because that is not an async action.
destroy
hook is not called if async_hook
is enabled after Promise creation.
Again, because the destroy
hook is called with the after
hook, calling the destroy
hook becomes trivial.
Tracking the async boundary when multiple .then()
calls are used on the same promise object, is not possible.
This directly confronts this issue, by making the async boundary the [[then]]
call.
Tracking the lifetime of promises
This proposal removes features from async_hooks
that provide insight into promises
. That information is still valuable.
To keep providing that information, I propose making a dedicated promise_hooks
module. A user can then connect the promise lifecycle with async_hooks
via a promiseId
that is exposed both in promise_hook
and via the resource in async_hooks
.
Compatability
This does change the default async chain, as the start point of the async-barrier is now the .then()
call and not the new Promise
call. However, I think this is actually a preferred default. For example, lazyloading a resource with new Promise
, would currently not be tracked correctly, but with the proposed change it will.
Even though this changes the async graph, the proposed async graph will still be valid and useful in most cases. And the existing async graph can be restored by integrating the proposed promise_hooks
module.
implementing this proposal should be part of a major release.
Open questions
- It is unclear to me how this would integrate with
async/await
. As I don't have a good grasp of how the internal[[then]]
calls are used and wrapped. It has been proposed that we supply our own customMicrotaskQueue
(#376 (comment)) to solve this. I think this could also be a good approach to hook into the native promise APIs, which will be necessary to track[[then]]
calls on thenables.
The [[then]]
method of a thenable will be called within a microtask. That microtask would need an async context so the init for the [[then]]
would have a executionAsyncId
and triggerAsyncId
to connect to which it currently does not have. It may be possible to use a subclassed MicrotaskQueue
, wrapping the EnqueueMicrotask(...)
method to inject additional microtasks before and after the given one to set and clear the context, but that might be expensive.
Also, I think this is actually slightly overthinking and/or looking past where the real issue is. We don't actually need promise hooks at all. What we need is microtask hooks.
When it comes down to it, promises are actually conceptually similar to event emitters in that they are only async because of external forces making them so. In this case, the microtask queue. The [[then]]
is the same as emitter.on(...)
and the resolve(...)
and reject(...)
are the same as emitter.emit(...)
with the only difference being that the resolve and reject internally have a microtask queue call. It conceptually looks something like this:
function resolve(value) {
for (const realResolve of resolversForChainedPromises) {
queueMicrotask(() => {
realResolve(value)
})
}
}
If instead of wrapping promises we wrapped microtasks at the point they are created and the points before and after they are executed, it would solve the same problem and would actually be a much simpler solution.
I think that somebody should try to implement a MicrotaskQueue
subclass. V8 lets us customize this easily in its existing API so let's do it!
I think @Qard makes good points. Hooking the queue seems more effective to me, too, and gets around the problem that there are many Promise implementations, but only one microtask queue. Its more robust to hook the locations of true asyncness then all the code above that might use them.
If instead of wrapping promises we wrapped microtasks at the point they are created and the points before and after they are executed, it would solve the same problem and would actually be a much simpler solution.
Looks promising and for sure worth a more detailed look/experiment!
I'm curious what else we may uncoer at the microtask level...
I can remember that v8 team talked about "hidden" promises at some diag summit and that they are not signaled via promise hooks for performance reason (they are anyway not visible/reachable to end user). Could be that we see them...
When it comes down to it, promises are actually conceptually similar to event emitters
Interesting analogy. This raises the question which context is the "correct" to propagate for an ALS system, that one where .on
/.then
is called or .emit
/.resolve
?
I don't think there's really a "correct" path. As Microsoft folks brought up in the previous async context formalization discussions, there's really two relevant paths: user-intent and technical causality. The user intent is generally what is reflected by capturing executionAsyncId
in the init event, while technical causality is reflected by triggerAsyncId
. These two branches will later converge in a callback of some form. However, the causality branch will have additional data between the call and callback in the graph while the user intent branch will connect directly, therefore complete context coverage can be attained at all points by always following the causal path.
I tried the MicrotaskQueue subclass idea. Doesn't work out quite how I hoped. The specific EnqueueMicrotask
variant we need to capture thenables is not virtual and the default constructor is private. It's also an opaque type which lacks the sizing information to be subclassed without some major changes to V8. Because the class exposed in v8.h is actually just a super-class of the internal MicrotaskQueue, there's not much we can work with for the custom MicrotaskQueue idea.
I'm refocusing my effort on seeing if I can just produce a new API which roughly parallels the PromiseHook API in functionality, but focuses specifically on microtask creation and lifecycle rather than promises.
Here is a small note on the performance of the current PromiseHook integration: nodejs/node#34493 (comment)
It seems to me that what we have now is quite expensive and low overhead should be one of the main requirements for the new implementation.
@puzpuzpuz Well, that is really one of the primary benefits of this approach.
This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made.
This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made.