child_process: async spawn methods not closing stdin
silverwind opened this issue · 30 comments
Currently, there is no way to distinguish if the process is being piped/redirected to or if it is spawned by child_process.exec
. The following examples both yield the exact same process.stdin
object, leaving a script undecided whether it should wait for input on stdin if it choses to accept data on stdin:
Attached stdin
echo "something" | iojs -p process.stdin
Spawned by exec
iojs -p "require('child_process').exec('iojs -p process.stdin', function(err,stdout) { process.stdout.write(stdout); })"
If it is possible to distinguish these cases, I'd like to see a new property on process.stdin
that returns true
when the process has its stdin attached, false
if it is not.
related: raineorshine/npm-check-updates#119
cc: @metaraine
cc @bnoordhuis / @piscisaureus?
Currently, there is no way to distinguish if the process is being piped/redirected to or if it is spawned by child_process.exec.
There is no distinguishing between the two just by looking at stdin, it's a pipe in both cases. You could pass a value in the environment as an out-of-band signal but that requires cooperation between the parent and the child.
it's a pipe in both cases
At which point is the pipe introduced in exec
? From what I gather, it's basically /bin/sh -c command
, which seems to be a regular fd to me when I try this:
$ /bin/sh -c "bash -c 'readlink /proc/$$/fd/1'"
/dev/pts/0
Or maybe a better demonstration of what I mean:
$ /bin/sh -c "bash -c 'readlink -f /dev/stdin'"
/dev/pts/0
$ /bin/sh -c "echo 'a' | bash -c 'readlink -f /dev/stdin'"
/proc/2046/fd/pipe:[3219332]
The first case's (which I think is what exec
does) stdin doesn't look like a pipe to me.
I think you subconsciously associate 'pipe' with the pipe character? I mean it in the system call sense, i.e. man 2 pipe
.
Indeed I was, I'll read that up, thanks.
@silverwind Is this still something you'd like to keep open?
I think I found a somewhat workable solution by resolving the stdin symlink. I still don't understand why it logs socket:[26350]
in the child_process case though, as @bnoordhuis mentioned it should be a pipe.
"use strict";
var fs = require("fs");
function resolveLink(link, cb) {
fs.lstat(link, function (err, stat) {
if (err) return cb(link);
if (stat.isSymbolicLink()) {
fs.readlink(link, function (err, linkString) {
resolveLink(linkString, cb);
})
} else {
cb(link);
}
});
};
resolveLink("/dev/stdin", console.log);
$ node log-stdin.js
/dev/pts/1
$ echo "a" | node log-stdin.js
pipe:[28908]
$ node -p 'require("child_process").execSync("node log-stdin.js").toString().trim()'
socket:[26350]
@silverwind execSync()
creates a UNIX socketpair; it's mostly interchangeable with a pipe except it can also be used to send over file descriptors.
@silverwind is this still viable, or should we close it?
@silverwind
all three of those branches print out "fd/0" from the symbolic link branch on osx node 7.4.0
so there's no way to tell (without incorporating an additional flag/option to the child script) if a process is spawned vs being piped input to?
Maybe this will help? I have no idea what's happening in the third case, or what these bytes even mean.
$ node -p process.stdin.constructor.name
ReadStream
$ echo "something" | node -p process.stdin.constructor.name
Socket
$ node -p "require('child_process').execSync('node -p process.stdin.constructor.name', function(err,stdout) { process.stdout.write(stdout); })"
<Buffer 53 6f 63 6b 65 74 0a>
@silverwind That does indeed seem helpful. I wonder if the constructor name is consistent on different platforms?
For your third example, you have to convert a Buffer to a String in order to see the encoded value. In this case, it's identical to example 2, which is unfortunate, as it appears to ambiguate the presence of a value on stdin.
$ node -p "require('child_process').execSync('node -p process.stdin.constructor.name').toString()"
Socket
Ah, of course. Well it won't help in our case:
$ diff <(echo "something" | node -p process.stdin) <(node -p "require('child_process').execSync('node -p process.stdin').toString().trim()")
which gives no output, the objects are the same.
It might be good if somebody could try to concisely say what the actual issue here is? It seems to me like it’s about checking whether a child process has been spawned by node vs some other process? In that case, testing stdio properties won’t be a reliable indicator for anything…
The issue is about finding a reliable way to detect if a script is the target of a shell's pipe, e.g. echo | node script.js
.
@silverwind But shell’s pipes and Node’s pipes are not really different things (apart from the fact that Node prefers socketpairs, but that has been mentioned above already…)?
True. I think the issues boil down to the get-stdin module which, when spawned through child_process
waits for an 'end'
event on stdin that never comes, because as far as get-stdin
is concerned, it's still waiting on data.
I think the proper solution would be to manually close the stdin pipe from the parent. Any idea how?
I think the proper solution would be to manually close the stdin pipe from the parent.
Yeah, I agree. That’s what piping from echo
does, too.
Does execSync
not close stdin after writing the input? That would seem like a bug to me…
@addaleax the issue seems to actually be with exec
not closing it, execSync
works. Example:
child.js
(simply echoes stdin to stdout)
let ret = '';
process.stdin.on('readable', () => {
let chunk;
while (chunk = process.stdin.read()) ret += chunk;
});
process.stdin.on('end', () => {
process.stdout.write(ret);
});
$ node -e "require('child_process').execSync('node child.js')" # works
$ node -e "require('child_process').exec('node child.js')" # hangs
@silverwind Right… exec
gives you a child process object, and calling .stdin.end();
should ”fix” the problem…
I’m not sure there is a way for Node to deviate from requiring an explicit stdin.end();
?
.stdin.end()
works, but interestingly, .execFile
, which also returns a ChildProcess does not require it
Nevermind my last comment, I had the wrong usage. All async methods are affected.
So it looks like the behaviour is actually documented:
Note that if a child process waits to read all of its input, the child will not continue until this stream has been closed via end().
execSync
, execFileSync
and spawnSync
get arount this limitation because they know beforehand when stdin ends through the input
option.
The async methods on the other hand support pushing data to the child's stdin anytime during execution. I can see some use of this for fork
and possibly spawn
, but not so much for exec
and execSync
, which in my eyes are more geared towards being used for simple one-shot commands, which don't involve a stdin pipe.
We could solve it by adding the input
options to exec
, execFile
and possibly spawn
and close stdin once the data has been pushed through. It'd be semver-major.
I wonder if this issue could be solved on the spawned child's end. Take this simple example:
process.stdin.on('end', () => console.log('end'));
process.stdin.resume();
This will not log 'end'
when ran with node script.js
, but will with printf '' | node script.js
. Timeout-based approaches were suggested, but I wonder if there are ways to detect if there's nothing on stdin
that don't involve the unreliable isTTY
property.
I don't understand what problem you're trying to solve. Just call .end() when you have nothing left to send to the child.
Just call .end() when you have nothing left to send
That's what I want to avoid. There's a lot code out there that does not call .end()
and I think the expected behaviour would be that the async spawning methods automatically close stdin, just like sync variants already do.
process.stdin.on('end', () => console.log('end')); process.stdin.resume();This will not log
'end'
when ran withnode script.js
, but will withprintf '' | node script.js
.
I feel the current behavior is the intended one. What I mean is that, currently it is possible to make a functional replica of the cat(1) command (without arguments):
process.stdin.on('data', buf => process.stdout.write(buf));
If this behavior is "fixed", and an end event is automatically emitted when the command is directly called w/o shell pipes, I wouldn't see a way of writing this.
Your other example child.js (#2339 (comment)) is almost an exact replica of the sponge(1) command feature-wise, and the same question may be asked of that as well.
Is this close-able? Or should it stay open?
I don't think there is consensus that this is a bug that needs fixing. I'll close it out.