nodists/nodist

SIGTERM code does not propagate well between layers

osher opened this issue · 18 comments

osher commented

SIGTERM code does not propagate well between layers
C:\Program Files (x86)\Nodist\bin\node.exe drops the ball with SIGTERM signals.

After fiddling, here's a work around:
When I set my env. var PATH to skip nodist\bin\node.exe it works as expected,
(i.e. path includes: C:\Program Files (x86)\Nodist;C:\Program Files (x86)\Nodist\bin;)
where with the value provided by the installer - it does not work and the problem is reproduced.
(i.e. path includes only: C:\Program Files (x86)\Nodist\bin;)

Full details bellow.

Osher

> nodist -v
0.8.4

windows 10.

I learnt that nodist is built like onion, where a process spawns a process with many pipes between.
I'm not sure what's the role of each layer - but. here's the story:

I'm working on a code-generator for web-servers.
The generator expects an open-api spec-file, and generates:
a) a project that answer mock replies
b) test suite for the core parts
c) end-to-end test suite that the server passes with mock replies that meet the spec.
Then developers can enter the project and replace mocks with real logic, and keep tests passing.

And I code the generator with with bdd, so I have an end-to-end test for the generator itself, that:

  1. use child_process.exec Run the generator with a given openapi-spec file

  2. use child_process.exec Run the generated unit-tests

  3. use child_process.exec to run the generated end-to-end tests in a mocha suite that:

    3.1. use a before-all hook to lunch the server using child = child_process.spawn('node', ['server.js'], { cwd: targetDir })
    3.2. fire all the generated requests using request/request
    3.3. use an after-all hook to child.kill('SIGTERM') and collect the logs it emits through it's well-handled graceful shutdown.

Then I noted that ever since I started to use nodist the server lunched for the generated end-to-end test in step 3.1 does not get the SIGTERM command.
I wasted on it half a day trying to isolate the problem...

To make a long story short, I found the problem is based on the node-process that is started using the child_process.spawn comand.
When I used absolute paths to concrete node version distributables - the child process heard the SIGTERM, and all was dandy.
(i.e. - spawn(["c:\\node-dist\\node-v6.5.0-win-x64\\node.exe",...)
But when I just used child_process.spawn('node'... - the problem was reproduced.

I almost gave up on nodist and moved to use my own simple repository of node versions, but on the last moment I noted there's node.exe in many places:

  • nodist\bin\node.exe
  • nodist\node.exe
  • nodist\v-x64\node.exe

(Now I also assume that they call each other in this order)

So I decided to try absolute path to concrete version from the nodist\v-x64. and it worked.
(i.e. - spawn(["C:\\Program Files (x86)\\Nodist\\v-x64\\6.5.0\\node.exe",...)
I tried absolute path to nodist\node.exe - and it worked.
I tired absolute path to nodist\bin\node.exe - and the problem was reproduced, the same way as it behaved when I spawn(["node", ....

I went to check my PATH variable.
I kept the code with spawn(["node", ... - i.e. - let the OS seach for node in %PATH%.

When I set my env. var PATH to skip nodist\bin\node.exe it works,
(i.e. path includes: C:\Program Files (x86)\Nodist;C:\Program Files (x86)\Nodist\bin;)
where with the value provided by the installer - the problem is reproduced.
(i.e. path includes only: C:\Program Files (x86)\Nodist\bin;)

Hi @osher!
Nice catch there (and sorry for the inconvenience)! The problem is that nodist uses a binary shim in order to be able to determine node version per directory and other cool stuff. Apparently the shim doesn't proxy the signals. Will look into how this can be done in go. If you'd like to have a go (no pun intended), here's the offending section: https://github.com/marcelklehr/nodist/blob/master/src/shim-node.go#L74

osher commented

very tempting, but I'm on a rough schedule.
Go will have to wait for a better time for me 😢

Personally I don't understand why there should be 3 layers, I think 2 should be enough:
one to do all the resolving involved (global/local versions ...whatever and all)
- and -
the actual version

And it does not matter in what language its implemented as long as it WORKS.

That's exactly how it is. Two layers. Where do you get the third one from? :)

Nodist\node.exe is the version that nodist uses internally, it shouldn't be used by anything else.

This is fixed on the master branch. I'm still waiting on some feedback for changes to the docs and will release a new version shortly.

osher commented

roger. I'm watching.
I'm waiting for the new version so I can handle #174 on same effort

osher commented

I just installed 0.8.7
I need to reopen this one... how can I do it?

Confirmed.

It appears signals in golang on windows is not the same thing as signals in node on windows: golang/go#6720 Since there are no signals on windows, Node probably uses some custom hack to send signals. I haven't found out what it is, yet, though. sigh

As a workaround you could use node's IPC messaging API to send singal-like messages.

osher commented

Well. I'm not sure that considerable.

I'm basically testing for 3 things:

  1. the server starts well - covered by the generated test before-all hook
  2. the server works as expected - with tests the developer adds)
  3. the server terminates upon a kill -s SIGTERM signal - covered by the generated test after-all hook

So using custom IPC messages defeats the purpose here...

My bad: In your specifc case, IPC messages is not useful, i agree. If someone else is quicker than me at finding out how node handles/implements signals on windows, please let me know here, cause I'm a little swamped, right now.

osher commented

AFAIK - it has to do with libuv. I'll try to get more details.

A workaround for this might be to spawn instances of process.execPath directly instead of going through the node shim again, although this of course only works when you're the one who actually spawns the process and not some script.

osher commented

Hmm. I basically use a tool which I facilitated out of my example at #179.

But this may still open a possibility. I'll try to PR them to use process.execPath

@osher Take a look at https://www.npmjs.com/package/infant

It might help with your signaling and works fine in Windows. All my Node apps run on Infant on top of Nodist.

Thanks

osher commented

@nullivex Thanks for the ref. its well made, but looks overkill to me. process.execPath did the trick.

@osher: I tried adding C:\Program Files (x86)\Nodist; in front of C:\Program Files (x86)\Nodist\bin; for the system Path variable, restarted my Windows 10, and had the same error
npm-debugModifiedPath.txt; I tried to add C:\\Program Files (x86)\\Nodist\\v-x64\\6.5.0\\node.exe or C:\Program Files (x86)\Nodist\v-x64\5.2.0\node.exe and still got the same. Have I misinterpreted your solution?

osher commented

woups. sorry for the late response.

Anyway, what I did was against the way nodist should be used, and by doing so I gave up some of it's abilities just to work around and get bye. I'm not working like that anymore - I had to implement other work-arounds (namely using IPC
parent: child = fork(...); .... ; child.send('die')
child: process.on('message', (m) => { if (m == 'die') process.emit('SIGTERM') })
which luckily I could do, because I'm in control of the target script.

Also - using fork makes sure they use the target node version without going through the shim, which by itself solves a part of the problem, even if no messages are exchanged between the two

So I'm not sure that you and I experience the same problem, so I can't really help 😢

FYI, since having WSL, here is the related issue:

"Support job control for Windows programs invoked under Bash (ctrl-c, ctrl-z support) microsoft/WSL#1614