child_process: async spawn methods not closing stdin

Question

silverwind opened this issue 9 years ago · 30 comments

Currently, there is no way to distinguish whether a process is being piped/redirected to or was spawned by child_process.exec. The following examples both yield the exact same process.stdin object, leaving a script unable to decide whether it should wait for input on stdin if it chooses to accept data on stdin:

Attached stdin

echo "something" | iojs -p process.stdin

Spawned by exec

iojs -p "require('child_process').exec('iojs -p process.stdin', function(err,stdout) { process.stdout.write(stdout); })"

If it is possible to distinguish these cases, I'd like to see a new property on process.stdin that returns true when the process has its stdin attached and false when it does not.

related: raineorshine/npm-check-updates#119
cc: @metaraine

Answer 1 · 2015-08-11T18:14:38.000Z

cc @bnoordhuis / @piscisaureus?

Answer 2 · 2015-08-11T18:36:54.000Z

Currently, there is no way to distinguish if the process is being piped/redirected to or if it is spawned by child_process.exec.

There is no distinguishing between the two just by looking at stdin, it's a pipe in both cases. You could pass a value in the environment as an out-of-band signal but that requires cooperation between the parent and the child.
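
A minimal sketch of that out-of-band signal, assuming the parent and child agree on an environment variable. The name SPAWNED_BY_PARENT is made up for illustration and is not something Node.js sets:

```javascript
const { spawnSync } = require('child_process');

// Hypothetical variable name; parent and child must both know it.
const MARKER = 'SPAWNED_BY_PARENT';

// Child source, run via `node -e`: reports whether the marker is set.
const childSource =
  `process.stdout.write(process.env.${MARKER} === '1' ? 'spawned' : 'standalone')`;

function runChild(withMarker) {
  // Copy the parent's environment and add or remove the marker.
  const env = Object.assign({}, process.env);
  if (withMarker) {
    env[MARKER] = '1';
  } else {
    delete env[MARKER];
  }
  const result = spawnSync(process.execPath, ['-e', childSource], {
    env,
    encoding: 'utf8',
  });
  return result.stdout;
}

console.log(runChild(true));  // child sees the marker
console.log(runChild(false)); // child does not see the marker
```

This only works when the parent cooperates; a child started by some other program never sees the marker.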

Answer 3 · 2015-08-11T19:03:23.000Z

it's a pipe in both cases

At which point is the pipe introduced in exec? From what I gather, it's basically /bin/sh -c command, which seems to be a regular fd to me when I try this:

$ /bin/sh -c "bash -c 'readlink /proc/$$/fd/1'"
/dev/pts/0

Answer 4 · 2015-08-11T19:12:45.000Z

Or maybe a better demonstration of what I mean:

$ /bin/sh -c "bash -c 'readlink -f /dev/stdin'"
/dev/pts/0

$ /bin/sh -c "echo 'a' | bash -c 'readlink -f /dev/stdin'"
/proc/2046/fd/pipe:[3219332]

In the first case (which I think is what exec does), stdin doesn't look like a pipe to me.

Answer 5 · 2015-08-11T19:23:44.000Z

I think you subconsciously associate 'pipe' with the pipe character? I mean it in the system call sense, i.e. man 2 pipe.

Answer 6 · 2015-08-11T19:26:20.000Z

Indeed I was, I'll read that up, thanks.

Answer 7 · 2016-03-14T21:41:34.000Z

@silverwind Is this still something you'd like to keep open?

Answer 8 · 2016-03-14T23:22:28.000Z

I think I found a somewhat workable solution by resolving the stdin symlink. I still don't understand why it logs socket:[26350] in the child_process case though, as @bnoordhuis mentioned it should be a pipe.

"use strict";
var fs = require("fs");

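// Follow a chain of symlinks and pass the final target to cb.
// Names such as "pipe:[28908]" are not real paths, so lstat fails on them
// and the name itself is reported.
function resolveLink(link, cb) {
  fs.lstat(link, function (err, stat) {
    if (err) return cb(link);
    if (stat.isSymbolicLink()) {
      fs.readlink(link, function (err, linkString) {
        // If the link cannot be read, report the last name we resolved.
        if (err) return cb(link);
        resolveLink(linkString, cb);
      });
    } else {
      cb(link);
    }
  });
}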

resolveLink("/dev/stdin", console.log);

$ node log-stdin.js
/dev/pts/1
$ echo "a" | node log-stdin.js
pipe:[28908]
$ node -p 'require("child_process").execSync("node log-stdin.js").toString().trim()'
socket:[26350]

Answer 9 · 2016-03-15T15:35:15.000Z

@silverwind execSync() creates a UNIX socketpair; it's mostly interchangeable with a pipe except it can also be used to send over file descriptors.
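
As a rough illustration of the pipe/socket distinction, the kind of file behind a descriptor can be inspected with fs.fstatSync. This is only a sketch: describeFd is a hypothetical helper, and the result for fd 0 depends on the platform and on how the process was started. A 'pipe' or 'socket' result says that stdin is not a terminal, not who is on the other end.

```javascript
const fs = require('fs');

// Hypothetical helper: classify a file descriptor by the kind of file behind it.
function describeFd(fd) {
  let stat;
  try {
    stat = fs.fstatSync(fd);
  } catch (err) {
    return 'unknown'; // fd is closed, or fstat is not supported for it
  }
  if (stat.isFIFO()) return 'pipe';
  if (stat.isSocket()) return 'socket';
  if (stat.isCharacterDevice()) return 'character device'; // e.g. a terminal or /dev/null
  if (stat.isFile()) return 'file';
  return 'other';
}

console.log(describeFd(0));
```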

Answer 10 · 2016-12-05T16:56:32.000Z

@silverwind is this still viable, or should we close it?

Answer 11 · 2017-02-11T19:51:58.000Z

@silverwind
All three of those cases print "fd/0" from the symbolic-link branch on OS X with Node 7.4.0.

So there's no way to tell (without incorporating an additional flag/option into the child script) whether a process was spawned or is being piped input?

Answer 12 · 2017-02-12T10:25:24.000Z

Maybe this will help? I have no idea what's happening in the third case, or what these bytes even mean.

$ node -p process.stdin.constructor.name
ReadStream
$ echo "something" | node -p process.stdin.constructor.name
Socket
$ node -p "require('child_process').execSync('node -p process.stdin.constructor.name', function(err,stdout) { process.stdout.write(stdout); })"
<Buffer 53 6f 63 6b 65 74 0a>

Answer 13 · 2017-02-12T13:57:40.000Z

@silverwind That does indeed seem helpful. I wonder if the constructor name is consistent on different platforms?

For your third example, you have to convert the Buffer to a String in order to see the encoded value. In this case, it's identical to example 2, which is unfortunate, as it means the constructor name cannot tell us whether there is a value on stdin.

$ node -p "require('child_process').execSync('node -p process.stdin.constructor.name').toString()"
Socket

Answer 14 · 2017-02-12T14:11:26.000Z

Ah, of course. Well it won't help in our case:

$ diff <(echo "something" | node -p process.stdin) <(node -p "require('child_process').execSync('node -p process.stdin').toString().trim()")

which gives no output, the objects are the same.

Answer 15 · 2017-02-12T15:28:09.000Z

It might be good if somebody could try to concisely say what the actual issue here is? It seems to me like it’s about checking whether a child process has been spawned by node vs some other process? In that case, testing stdio properties won’t be a reliable indicator for anything…

Answer 16 · 2017-02-12T15:58:56.000Z

The issue is about finding a reliable way to detect if a script is the target of a shell's pipe, e.g. echo | node script.js.

Answer 17 · 2017-02-12T16:06:33.000Z

@silverwind But shell’s pipes and Node’s pipes are not really different things (apart from the fact that Node prefers socketpairs, but that has been mentioned above already…)?

Answer 18 · 2017-02-12T16:41:59.000Z

True. I think the issue boils down to the get-stdin module, which, when run in a process spawned through child_process, waits for an 'end' event on stdin that never comes because, as far as get-stdin is concerned, it is still waiting on data.

I think the proper solution would be to manually close the stdin pipe from the parent. Any idea how?

Answer 19 · 2017-02-12T16:46:51.000Z

I think the proper solution would be to manually close the stdin pipe from the parent.

Yeah, I agree. That’s what piping from echo does, too.

Does execSync not close stdin after writing the input? That would seem like a bug to me…

Answer 20 · 2017-02-12T17:09:56.000Z

@addaleax the issue seems to actually be with exec not closing it, execSync works. Example:

`child.js` (simply echoes stdin to stdout)

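// Buffer everything that arrives on stdin.
let ret = '';
process.stdin.on('readable', () => {
  let chunk;
  // read() returns null once the internal buffer is drained.
  while ((chunk = process.stdin.read()) !== null) ret += chunk;
});
// 'end' only fires after the writing side has closed the stream.
process.stdin.on('end', () => {
  process.stdout.write(ret);
});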

$ node -e "require('child_process').execSync('node child.js')" # works
$ node -e "require('child_process').exec('node child.js')" # hangs

Answer 21 · 2017-02-12T17:14:01.000Z

@silverwind Right… exec gives you a child process object, and calling .stdin.end() should "fix" the problem…

I’m not sure there is a way for Node to deviate from requiring an explicit stdin.end();?
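
A minimal sketch of that explicit end() call. It uses execFile rather than exec only to avoid shell quoting of the inline child script; both return a ChildProcess with a stdin stream. The helper name runAndCloseStdin is made up for illustration:

```javascript
const { execFile } = require('child_process');

// Child source, run via `node -e`: buffers stdin and replies once 'end' fires.
const childSource = `
  let ret = '';
  process.stdin.setEncoding('utf8');
  process.stdin.on('data', (chunk) => { ret += chunk; });
  process.stdin.on('end', () => { process.stdout.write('got:' + ret); });
`;

// Hypothetical helper: start the child and close its stdin right away.
function runAndCloseStdin(callback) {
  const child = execFile(process.execPath, ['-e', childSource], callback);
  // Without this call the child never sees 'end' and keeps waiting.
  child.stdin.end();
  return child;
}

runAndCloseStdin((err, stdout) => {
  if (err) throw err;
  console.log(JSON.stringify(stdout));
});
```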

Answer 22 · 2017-02-12T17:20:09.000Z

.stdin.end() works, ~~but interestingly, execFile, which also returns a ChildProcess does not require it~~.

Answer 23 · 2017-02-12T17:23:33.000Z

Never mind my last comment, I had the wrong usage. All async methods are affected.

Answer 24 · 2017-02-12T18:02:25.000Z

So it looks like the behaviour is actually documented:

Note that if a child process waits to read all of its input, the child will not continue until this stream has been closed via end().

execSync, execFileSync and spawnSync get around this limitation because they know beforehand when stdin ends through the input option.
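
For illustration, a small sketch of the sync path with the input option. The child script and the runSync helper are made up for this example; the child only produces output after it has seen 'end' on stdin:

```javascript
const { execFileSync } = require('child_process');

// Child source, run via `node -e`: upper-cases stdin once it has ended.
const childSource = `
  let ret = '';
  process.stdin.setEncoding('utf8');
  process.stdin.on('data', (chunk) => { ret += chunk; });
  process.stdin.on('end', () => { process.stdout.write(ret.toUpperCase()); });
`;

// Hypothetical helper: the whole input is known up front, so Node can
// write it to the child's stdin and close the stream afterwards.
function runSync(input) {
  return execFileSync(process.execPath, ['-e', childSource], {
    input,
    encoding: 'utf8',
  });
}

console.log(runSync('hello'));
```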

The async methods, on the other hand, support pushing data to the child's stdin at any time during execution. I can see some use of this for fork and possibly spawn, but not so much for exec and execFile, which in my eyes are more geared towards being used for simple one-shot commands, which don't involve a stdin pipe.

We could solve it by adding the input option to exec, execFile and possibly spawn, and closing stdin once the data has been pushed through. It'd be semver-major.
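
Until something like that exists, the idea can be approximated in userland. The following is only a sketch: execFileWithInput is a hypothetical helper, not a Node.js API, and it takes the input as a separate argument rather than as an option.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper, not part of Node.js: emulates an `input` option
// for the async execFile by writing the data and closing stdin.
function execFileWithInput(file, args, input) {
  return new Promise((resolve, reject) => {
    const child = execFile(file, args, (err, stdout, stderr) => {
      if (err) reject(err);
      else resolve({ stdout, stderr });
    });
    // Swallow write errors (such as EPIPE) if the child exits without reading.
    child.stdin.on('error', () => {});
    // Write the input, if any, and close stdin in one step.
    child.stdin.end(input);
  });
}

// Usage: a cat-like child that copies stdin to stdout.
execFileWithInput(process.execPath, ['-e', 'process.stdin.pipe(process.stdout)'], 'ping')
  .then((result) => console.log(result.stdout));
```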

Answer 25 · 2017-02-17T10:26:41.000Z

I wonder if this issue could be solved on the spawned child's end. Take this simple example:

process.stdin.on('end', () => console.log('end'));
process.stdin.resume();

This will not log 'end' when run with node script.js, but will with printf '' | node script.js. Timeout-based approaches were suggested, but I wonder if there are ways to detect if there's nothing on stdin that don't involve the unreliable isTTY property.
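
For reference, a timeout-based approach might look roughly like the sketch below. readWithTimeout is a hypothetical helper, and the approach is inherently racy: a producer that is merely slow is indistinguishable from one that will never write.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper: resolves with the collected input once the stream
// ends, or with null if no data arrived and the stream had not ended
// within `ms` milliseconds.
function readWithTimeout(stream, ms) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    let timer = setTimeout(() => { cleanup(); resolve(null); }, ms);

    function onData(chunk) {
      chunks.push(Buffer.from(chunk));
      // Data arrived, so keep waiting for 'end' without a deadline.
      clearTimeout(timer);
      timer = null;
    }
    function onEnd() { cleanup(); resolve(Buffer.concat(chunks).toString()); }
    function onError(err) { cleanup(); reject(err); }
    function cleanup() {
      clearTimeout(timer);
      stream.removeListener('data', onData);
      stream.removeListener('end', onEnd);
      stream.removeListener('error', onError);
      stream.pause(); // stop reading so the stream does not keep flowing
    }

    stream.on('data', onData);
    stream.on('end', onEnd);
    stream.on('error', onError);
  });
}

// Demo with in-memory streams standing in for process.stdin.
const idle = new PassThrough(); // never written to, never ended
readWithTimeout(idle, 50).then((value) => console.log('idle:', value));

const fed = new PassThrough();
const fedResult = readWithTimeout(fed, 1000);
fed.end('abc');
fedResult.then((value) => console.log('fed:', value));
```

In a real script the first argument would be process.stdin, and the timeout value would be a guess about how long a writer may take to start.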

Answer 26 · 2017-02-17T11:49:27.000Z

I don't understand what problem you're trying to solve. Just call .end() when you have nothing left to send to the child.

Answer 27 · 2017-02-17T19:19:22.000Z

Just call .end() when you have nothing left to send

That's what I want to avoid. There's a lot of code out there that does not call .end(), and I think the expected behaviour would be that the async spawning methods automatically close stdin, just like the sync variants already do.

Answer 28 · 2017-02-19T07:04:37.000Z

process.stdin.on('end', () => console.log('end'));
process.stdin.resume();
This will not log 'end' when ran with node script.js, but will with printf '' | node script.js.

I feel the current behavior is the intended one. What I mean is that, currently it is possible to make a functional replica of the cat(1) command (without arguments):

process.stdin.on('data', buf => process.stdout.write(buf));

If this behavior is "fixed", and an end event is automatically emitted when the command is directly called w/o shell pipes, I wouldn't see a way of writing this.

Your other example child.js (#2339 (comment)) is almost an exact replica of the sponge(1) command feature-wise, and the same question may be asked of that as well.

Answer 29 · 2017-07-29T04:01:14.000Z

Is this close-able? Or should it stay open?

Answer 30 · 2017-07-29T08:39:03.000Z

I don't think there is consensus that this is a bug that needs fixing. I'll close it out.
