nodejs/node

Different async_hooks behavior in Node 10

Conduitry opened this issue ยท 28 comments

  • Version: v10.0.0
  • Platform: Linux, Windows 10 WSL

Running the following code

const async_hooks = require("async_hooks");
const fs = require("fs");

async_hooks
  .createHook({
    init: (asyncId, type, triggerAsyncId, resource) => {
      fs.writeSync(1, `${triggerAsyncId} => ${asyncId}\n`);
    }
  })
  .enable();

async function main() {
  console.log("hello");
  await null;
  console.log("hello");
}

main();

on Node 9 gives the output

1 => 6
hello
1 => 7
6 => 8
8 => 9
hello
9 => 10

while on Node 10 it gives

1 => 6
hello
1 => 7
1 => 8
hello
1 => 9

That is, on Node 10, the triggerAsyncId is always 1, and I am unable to track which contexts follow from which other contexts.

Is one of these unexpected behavior? If the change is a known (undocumented?) new behavior, is there any way with Node 10 to achieve what I was doing under previous versions?

I don't really see any changes on our end that could've caused this. I think it might be due to changes in V8.

ping @nodejs/v8 โ€” did anything change about PromiseHooks or Promises that could be causing this? I see there's one less Promise now so it does seem like at least some stuff changed.

/cc @nodejs/async_hooks

Can someone bisect to verify that this was caused by a V8 upgrade? /cc @bmeurer

Unfortunately don't have time do git bisect given the time it takes to recompile when V8 changes but as far as I can tell on v9.x all the promises (3 of them) are related (that is, each subsequent one seems to be a child of the previous one). On v10.0.0, they're both (only 2! now) independent.

Oh and neither version seems to trigger before or after hooks in this example.

(For all I know this new behaviour could be more correct...)

That's on purpose. We optimize away some promise allocations.

@bmeurer Are you saying the number of promises is intentional or the fact that we're no longer tracking the nested promises as children like we used to? This seems to complicate async hooks behaviour in relation to async/await.

(Although I don't really know if that's true... the new log might be slightly more accurate if less helpful in terms of tracking.)

@apapirovski The fundamental problem is that there's no formal semantics for PromiseHooks. So we don't even know what's right or wrong. I'm not sure about the relation chain here either, as I don't now what's right and what's wrong.

@AndreasMadsen any thoughts on this?

Well, this will certainly need to get addressed somehow... but I don't think we can do it without changes in V8. We don't really have the required context information to patch around this in any way.

How about specifying requirements and adding comprehensive tests first? The turnaround time for V8 changes to end up in Node is pretty long. Fixing this kind of issues one by one on an ad hoc basis will not work.

How about we gather a bunch of consumers of async_hooks and promises and work out use cases of what we need to accomplish with them before we recommend any changes?

Having a set of use cases definitely helps.

@bmeurer, @ofrobots and I have a proposal wrt Promise hooks that we would like to discuss in diagnostic WG. For that, use cases would be great too.

I think the behavior is not only different. Seems that now async_hooks can't track the context after awaiting promises with async/await.

// example.js
// eslint-disable-next-line node/no-missing-require
const {createHook, triggerAsyncId, executionAsyncId} = require('async_hooks');
const fs = require('fs');

createHook({
  init: (asyncId, type, triggerAsyncId) => {
    printAsyncIds('INIT', asyncId, triggerAsyncId);
  }
}).enable();

async function main() {
  printAsyncIds('BEFORE', triggerAsyncId(), executionAsyncId());
  await sleep(300);
  printAsyncIds('AFTER', triggerAsyncId(), executionAsyncId());
}

main();

function printAsyncIds(label, triggerId, asyncId) {
  fs.writeSync(1, `${label}:\t | TID(${triggerId})\t| EID(${asyncId})\n`);
}

function sleep(msecs) {
  return new Promise(function(resolve) {
    printAsyncIds('PROM', triggerAsyncId(), executionAsyncId());
    setTimeout(() => {
      printAsyncIds('TOUT', triggerAsyncId(), executionAsyncId());
      resolve();
    }, msecs);
  });
}

Output:

$ docker run -v "$PWD/example.js":/example.js:ro node:10 node example.js 
INIT:	 | TID(5)	| EID(1)
BEFORE:	 | TID(0)	| EID(1)
INIT:	 | TID(6)	| EID(1)
PROM:	 | TID(0)	| EID(1)
INIT:	 | TID(7)	| EID(1)
INIT:	 | TID(8)	| EID(1)
INIT:	 | TID(9)	| EID(1)
INIT:	 | TID(10)	| EID(6)
TOUT:	 | TID(1)	| EID(7)
AFTER:	 | TID(0)	| EID(0) <----

$ docker run -v "$PWD/example.js":/example.js:ro node:9 node example.js 
INIT:	 | TID(5)	| EID(1)
BEFORE:	 | TID(0)	| EID(1)
INIT:	 | TID(6)	| EID(1)
PROM:	 | TID(0)	| EID(1)
INIT:	 | TID(7)	| EID(1)
INIT:	 | TID(8)	| EID(1)
INIT:	 | TID(9)	| EID(5)
INIT:	 | TID(10)	| EID(9)
INIT:	 | TID(11)	| EID(6)
TOUT:	 | TID(1)	| EID(7)
AFTER:	 | TID(9)	| EID(10)

NOTE: with plain promises it still works as expected:

function main() {
  printAsyncIds('BEFORE', triggerAsyncId(), executionAsyncId());
  return sleep(300).then(function() {
    printAsyncIds('AFTER', triggerAsyncId(), executionAsyncId());
  });
}
$ docker run -v "$PWD/example.js":/example.js:ro node:10 node example.js 
BEFORE:	 | TID(0)	| EID(1)
INIT:	 | TID(5)	| EID(1)
PROM:	 | TID(0)	| EID(1)
INIT:	 | TID(6)	| EID(1)
INIT:	 | TID(7)	| EID(1)
INIT:	 | TID(8)	| EID(5)
TOUT:	 | TID(1)	| EID(6)
AFTER:	 | TID(5)	| EID(8)
Qard commented

Confirmed, this breaks any sort of CLS-like system on async_hooks.

const ah = require('async_hooks')
const assert = require('assert')

const cls = {}
const states = new Map()

Object.defineProperty(cls, 'state', {
  get () {
    const asyncId = ah.executionAsyncId()
    return states.has(asyncId) ? states.get(asyncId) : undefined
  },
  set (state) {
    const asyncId = ah.executionAsyncId()
    states.set(asyncId, state)
  }
})

const hook = ah.createHook({
  init (asyncId, type, triggerAsyncId, resource) {
    states.set(asyncId, cls.state)
  },
  destroy (asyncId) {
    if (!states.has(asyncId)) return
    states.delete(asyncId)
  }
})
hook.enable()

cls.state = 'test'

function delay(ms) {
  return new Promise((resolve) => {
    assert(cls.state === 'test', 'state is correct before setTimeout')
    setTimeout(() => {
      assert(cls.state === 'test', 'state is correct before resolve')
      resolve()
      assert(cls.state === 'test', 'state is correct after resolve')
    }, ms)
    assert(cls.state === 'test', 'state is correct after setTimeout')
  })
}

async function main() {
  assert(cls.state === 'test', 'state is correct before await')
  await delay(100)
  // Fails here...
  assert(cls.state === 'test', 'state is correct after await')
}

main().then(
  value => console.log({ value }),
  error => console.error(error)
)

Also, I did a bisect on Node.js and was able to confirm the issue came from the V8 6.6 upgrade.

As a side note: executionAsyncId and triggerAsyncId functions both return 0 at the point where the failure occurs.

Yeah that's clearly a bug in V8 where we don't remember the outer promise for an async function unless the debugger is active. Patch below fixes that issue. @targos can you float that on Node 10?

diff --git a/deps/v8/src/builtins/builtins-async-gen.cc b/deps/v8/src/builtins/builtins-async-gen.cc
index 8036763c83..073c96a2e0 100644
--- a/deps/v8/src/builtins/builtins-async-gen.cc
+++ b/deps/v8/src/builtins/builtins-async-gen.cc
@@ -44,7 +44,7 @@ void AsyncBuiltinsAssembler::Await(Node* context, Node* generator, Node* value,
   // When debugging, we need to link from the {generator} to the
   // {outer_promise} of the async function/generator.
   Label done(this);
-  GotoIfNot(IsDebugActive(), &done);
+  GotoIfNot(IsPromiseHookEnabledOrDebugIsActive(), &done);
   CallRuntime(Runtime::kSetProperty, native_context, generator,
               LoadRoot(Heap::kgenerator_outer_promise_symbolRootIndex),
               outer_promise, SmiConstant(LanguageMode::kStrict));

V8 patch landed in v8/v8@ca76392. Any chance it could be merged to 6.7 and 6.6 upstream?

Upstream fix also landed. Don't think we'll back-merge to 6.6 or 6.7, since this doesn't affect Chrome. @natorion?

I opened a floating patch for it in #20467.

I am wondering if we should add a regression test for this as well. I am somewhat surprised that we did not catch this on our side when updating V8. Thoughts?

A regression test would be nice ๐Ÿ‘

I have a proposal wrt Promise hooks that we would like to discuss in diagnostic WG. For that, use cases would be great too.

@hashseed please invite me to said meeting? I have some ideas I'd like to discuss there about possible use cases for async_hooks and promises.

Would also love to discuss this in the collab summit during @bmeurer's session about promise performance or some other time (maybe in my session).

Said meeting happened two weeks ago. Would love to discuss during collab summit though.

@hashseed my bad and +1 on discussing in the collab summit. Are there any meeting notes I can read to be more prepared?

It looks like this is fixed in 10.4.0 with the update of V8 from 6.6 to 6.7 ๐ŸŽ‰

I think this essentially leaves us with somebody having to write tests for this and then we can close this issue โ€“ right?

The now-obsolete PR to float the V8 fix on 6.6 included a brief test, if that's useful.

Can we get @BridgeAR's simple test included for at least a sanity test?

This is fixed in 10.4.0.