Different async_hooks behavior in Node 10

Question

Different async_hooks behavior in Node 10

Conduitry opened this issue 6 years ago · 28 comments

Version: v10.0.0
Platform: Linux, Windows 10 WSL

Running the following code

const async_hooks = require("async_hooks");
const fs = require("fs");

async_hooks
  .createHook({
    init: (asyncId, type, triggerAsyncId, resource) => {
      fs.writeSync(1, `${triggerAsyncId} => ${asyncId}\n`);
    }
  })
  .enable();

async function main() {
  console.log("hello");
  await null;
  console.log("hello");
}

main();

on Node 9 gives the output

1 => 6
hello
1 => 7
6 => 8
8 => 9
hello
9 => 10

while on Node 10 it gives

1 => 6
hello
1 => 7
1 => 8
hello
1 => 9

That is, on Node 10, the triggerAsyncId is always 1, and I am unable to track which contexts follow from which other contexts.

Is one of these unexpected behavior? If the change is a known (undocumented?) new behavior, is there any way with Node 10 to achieve what I was doing under previous versions?

Answer 1 · 2018-04-25T07:14:51.000Z

I don't really see any changes on our end that could've caused this. I think it might be due to changes in V8.

ping @nodejs/v8 — did anything change about PromiseHooks or Promises that could be causing this? I see there's one less Promise now so it does seem like at least some stuff changed.

/cc @nodejs/async_hooks

Answer 2 · 2018-04-25T07:21:11.000Z

Can someone bisect to verify that this was caused by a V8 upgrade? /cc @bmeurer

Answer 3 · 2018-04-25T07:29:38.000Z

Unfortunately don't have time do git bisect given the time it takes to recompile when V8 changes but as far as I can tell on v9.x all the promises (3 of them) are related (that is, each subsequent one seems to be a child of the previous one). On v10.0.0, they're both (only 2! now) independent.

Oh and neither version seems to trigger before or after hooks in this example.

(For all I know this new behaviour could be more correct...)

Answer 4 · 2018-04-25T08:49:07.000Z

That's on purpose. We optimize away some promise allocations.

Answer 5 · 2018-04-25T09:06:11.000Z

@bmeurer Are you saying the number of promises is intentional or the fact that we're no longer tracking the nested promises as children like we used to? This seems to complicate async hooks behaviour in relation to async/await.

(Although I don't really know if that's true... the new log might be slightly more accurate if less helpful in terms of tracking.)

Answer 6 · 2018-04-25T09:12:03.000Z

@apapirovski The fundamental problem is that there's no formal semantics for PromiseHooks. So we don't even know what's right or wrong. I'm not sure about the relation chain here either, as I don't now what's right and what's wrong.

Answer 7 · 2018-04-27T23:25:35.000Z

@AndreasMadsen any thoughts on this?

Answer 8 · 2018-04-28T06:07:22.000Z

Well, this will certainly need to get addressed somehow... but I don't think we can do it without changes in V8. We don't really have the required context information to patch around this in any way.

Answer 9 · 2018-04-28T06:23:43.000Z

How about specifying requirements and adding comprehensive tests first? The turnaround time for V8 changes to end up in Node is pretty long. Fixing this kind of issues one by one on an ad hoc basis will not work.

Answer 10 · 2018-04-28T11:33:59.000Z

How about we gather a bunch of consumers of async_hooks and promises and work out use cases of what we need to accomplish with them before we recommend any changes?

Answer 11 · 2018-04-28T11:42:02.000Z

Having a set of use cases definitely helps.

@bmeurer, @ofrobots and I have a proposal wrt Promise hooks that we would like to discuss in diagnostic WG. For that, use cases would be great too.

Answer 12 · 2018-04-29T16:31:12.000Z

I think the behavior is not only different. Seems that now async_hooks can't track the context after awaiting promises with async/await.

// example.js
// eslint-disable-next-line node/no-missing-require
const {createHook, triggerAsyncId, executionAsyncId} = require('async_hooks');
const fs = require('fs');

createHook({
  init: (asyncId, type, triggerAsyncId) => {
    printAsyncIds('INIT', asyncId, triggerAsyncId);
  }
}).enable();

async function main() {
  printAsyncIds('BEFORE', triggerAsyncId(), executionAsyncId());
  await sleep(300);
  printAsyncIds('AFTER', triggerAsyncId(), executionAsyncId());
}

main();

function printAsyncIds(label, triggerId, asyncId) {
  fs.writeSync(1, `${label}:\t | TID(${triggerId})\t| EID(${asyncId})\n`);
}

function sleep(msecs) {
  return new Promise(function(resolve) {
    printAsyncIds('PROM', triggerAsyncId(), executionAsyncId());
    setTimeout(() => {
      printAsyncIds('TOUT', triggerAsyncId(), executionAsyncId());
      resolve();
    }, msecs);
  });
}

Output:

$ docker run -v "$PWD/example.js":/example.js:ro node:10 node example.js 
INIT:	 | TID(5)	| EID(1)
BEFORE:	 | TID(0)	| EID(1)
INIT:	 | TID(6)	| EID(1)
PROM:	 | TID(0)	| EID(1)
INIT:	 | TID(7)	| EID(1)
INIT:	 | TID(8)	| EID(1)
INIT:	 | TID(9)	| EID(1)
INIT:	 | TID(10)	| EID(6)
TOUT:	 | TID(1)	| EID(7)
AFTER:	 | TID(0)	| EID(0) <----

$ docker run -v "$PWD/example.js":/example.js:ro node:9 node example.js 
INIT:	 | TID(5)	| EID(1)
BEFORE:	 | TID(0)	| EID(1)
INIT:	 | TID(6)	| EID(1)
PROM:	 | TID(0)	| EID(1)
INIT:	 | TID(7)	| EID(1)
INIT:	 | TID(8)	| EID(1)
INIT:	 | TID(9)	| EID(5)
INIT:	 | TID(10)	| EID(9)
INIT:	 | TID(11)	| EID(6)
TOUT:	 | TID(1)	| EID(7)
AFTER:	 | TID(9)	| EID(10)

NOTE: with plain promises it still works as expected:

function main() {
  printAsyncIds('BEFORE', triggerAsyncId(), executionAsyncId());
  return sleep(300).then(function() {
    printAsyncIds('AFTER', triggerAsyncId(), executionAsyncId());
  });
}

$ docker run -v "$PWD/example.js":/example.js:ro node:10 node example.js 
BEFORE:	 | TID(0)	| EID(1)
INIT:	 | TID(5)	| EID(1)
PROM:	 | TID(0)	| EID(1)
INIT:	 | TID(6)	| EID(1)
INIT:	 | TID(7)	| EID(1)
INIT:	 | TID(8)	| EID(5)
TOUT:	 | TID(1)	| EID(6)
AFTER:	 | TID(5)	| EID(8)

Answer 13 · 2018-05-01T02:11:15.000Z

Confirmed, this breaks any sort of CLS-like system on async_hooks.

const ah = require('async_hooks')
const assert = require('assert')

const cls = {}
const states = new Map()

Object.defineProperty(cls, 'state', {
  get () {
    const asyncId = ah.executionAsyncId()
    return states.has(asyncId) ? states.get(asyncId) : undefined
  },
  set (state) {
    const asyncId = ah.executionAsyncId()
    states.set(asyncId, state)
  }
})

const hook = ah.createHook({
  init (asyncId, type, triggerAsyncId, resource) {
    states.set(asyncId, cls.state)
  },
  destroy (asyncId) {
    if (!states.has(asyncId)) return
    states.delete(asyncId)
  }
})
hook.enable()

cls.state = 'test'

function delay(ms) {
  return new Promise((resolve) => {
    assert(cls.state === 'test', 'state is correct before setTimeout')
    setTimeout(() => {
      assert(cls.state === 'test', 'state is correct before resolve')
      resolve()
      assert(cls.state === 'test', 'state is correct after resolve')
    }, ms)
    assert(cls.state === 'test', 'state is correct after setTimeout')
  })
}

async function main() {
  assert(cls.state === 'test', 'state is correct before await')
  await delay(100)
  // Fails here...
  assert(cls.state === 'test', 'state is correct after await')
}

main().then(
  value => console.log({ value }),
  error => console.error(error)
)

Also, I did a bisect on Node.js and was able to confirm the issue came from the V8 6.6 upgrade.

As a side note: executionAsyncId and triggerAsyncId functions both return 0 at the point where the failure occurs.

Answer 14 · 2018-05-02T09:52:51.000Z

Yeah that's clearly a bug in V8 where we don't remember the outer promise for an async function unless the debugger is active. Patch below fixes that issue. @targos can you float that on Node 10?

diff --git a/deps/v8/src/builtins/builtins-async-gen.cc b/deps/v8/src/builtins/builtins-async-gen.cc
index 8036763c83..073c96a2e0 100644
--- a/deps/v8/src/builtins/builtins-async-gen.cc
+++ b/deps/v8/src/builtins/builtins-async-gen.cc
@@ -44,7 +44,7 @@ void AsyncBuiltinsAssembler::Await(Node* context, Node* generator, Node* value,
   // When debugging, we need to link from the {generator} to the
   // {outer_promise} of the async function/generator.
   Label done(this);
-  GotoIfNot(IsDebugActive(), &done);
+  GotoIfNot(IsPromiseHookEnabledOrDebugIsActive(), &done);
   CallRuntime(Runtime::kSetProperty, native_context, generator,
               LoadRoot(Heap::kgenerator_outer_promise_symbolRootIndex),
               outer_promise, SmiConstant(LanguageMode::kStrict));

Answer 15 · 2018-05-02T12:03:20.000Z

V8 patch landed in v8/v8@ca76392. Any chance it could be merged to 6.7 and 6.6 upstream?

Answer 16 · 2018-05-02T12:14:05.000Z

Upstream fix also landed. Don't think we'll back-merge to 6.6 or 6.7, since this doesn't affect Chrome. @natorion?

Answer 17 · 2018-05-02T12:32:16.000Z

I opened a floating patch for it in #20467.

Answer 18 · 2018-05-02T14:36:07.000Z

I am wondering if we should add a regression test for this as well. I am somewhat surprised that we did not catch this on our side when updating V8. Thoughts?

Answer 19 · 2018-05-02T14:59:40.000Z

A regression test would be nice 👍

Answer 20 · 2018-05-13T15:04:44.000Z

I have a proposal wrt Promise hooks that we would like to discuss in diagnostic WG. For that, use cases would be great too.

@hashseed please invite me to said meeting? I have some ideas I'd like to discuss there about possible use cases for async_hooks and promises.

Answer 21 · 2018-05-13T15:05:16.000Z

Would also love to discuss this in the collab summit during @bmeurer's session about promise performance or some other time (maybe in my session).

Answer 22 · 2018-05-15T06:37:03.000Z

Said meeting happened two weeks ago. Would love to discuss during collab summit though.

Answer 23 · 2018-05-15T11:24:20.000Z

@hashseed my bad and +1 on discussing in the collab summit. Are there any meeting notes I can read to be more prepared?

Answer 24 · 2018-06-06T20:20:53.000Z

It looks like this is fixed in 10.4.0 with the update of V8 from 6.6 to 6.7 🎉

Answer 25 · 2018-06-06T21:33:03.000Z

I think this essentially leaves us with somebody having to write tests for this and then we can close this issue – right?

Answer 26 · 2018-06-06T21:43:35.000Z

The now-obsolete PR to float the V8 fix on 6.6 included a brief test, if that's useful.

Answer 27 · 2018-08-03T18:11:35.000Z

Can we get @BridgeAR's simple test included for at least a sanity test?

Answer 28 · 2018-08-23T07:54:18.000Z

This is fixed in 10.4.0.