nodejs/node-v0.x-archive

child_process.spawn ignores PATHEXT on Windows

OrangeDog opened this issue · 67 comments

For example, require('child_process').spawn('mycmd') won't find C:\util\mycmd.bat when PATH contains C:\util and PATHEXT contains .BAT.
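To make the expected lookup concrete, here is a minimal sketch of a PATHEXT-aware search. `resolveWithPathExt` and its injected `exists` predicate are hypothetical names invented for illustration, and the default PATHEXT value is an assumption. The sketch ignores the current directory and is not how Node.js or libuv actually resolve commands.

```javascript
const path = require('path');

// Hypothetical helper (not part of Node.js or libuv): look `cmd` up in each
// PATH directory, trying every PATHEXT extension in order.
// `exists` is injected so the lookup can be exercised without touching disk.
function resolveWithPathExt(cmd, env, exists) {
  const dirs = (env.PATH || '').split(';').filter(Boolean);
  // Assumed fallback list for when PATHEXT is unset.
  const exts = (env.PATHEXT || '.COM;.EXE;.BAT;.CMD').split(';').filter(Boolean);
  // A name that already carries an extension is tried verbatim first.
  const suffixes = path.win32.extname(cmd) ? [''].concat(exts) : exts;
  for (const dir of dirs) {
    for (const suffix of suffixes) {
      const candidate = path.win32.join(dir, cmd + suffix);
      if (exists(candidate)) return candidate;
    }
  }
  return null; // the caller would report ENOENT here
}

// Usage with a fake file system containing only C:\util\mycmd.bat.
const onDisk = new Set(['c:\\util\\mycmd.bat']);
const found = resolveWithPathExt(
  'mycmd',
  { PATH: 'C:\\Windows;C:\\util', PATHEXT: '.EXE;.BAT' },
  (p) => onDisk.has(p.toLowerCase())
);
console.log(found);
```

Injecting `exists` keeps the search logic separate from the file system, so the same function can be checked on any platform.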

Ye olde code (https://github.com/joyent/node/blob/v0.4/src/node_child_process_win32.cc) looks like it would have worked, but I have no idea where the v0.6 equivalent is.

At the moment child_process.spawn() can only run exe files. This is a limitation of the CreateProcess API. child_process.exec() can run batch files though. We could make libuv prefix cmd /c if the file is not an exe - but iirc Peter was strongly against that.

@DrPizza @igorzi thoughts?

@OrangeDog as a workaround, you could use

require('child_process').spawn('cmd', ['/s', '/c', '"C:\\util\\mycmd.bat"'], { 
  windowsVerbatimArguments: true
});

(this option is internal and not guaranteed to stick, btw)

Not very helpful for writing portable code though.

@OrangeDog Well you can't really write a portable batch file anyway.

@DrPizza suggested today that we could add a { shell: true } option to spawn. I kind of like the idea. It allows using spawn for the same purpose as exec without buffering all the output. We also currently have the weird distinction between exec and execFile; we could just make those the same function but with a different default for the shell option. @ry, @bnoordhuis, what do you guys think?

It'd be superfluous on Unices - the two things a shell script needs are a valid shebang and the executable bit set.

@bnoordhuis not completely because people may want to do ls -r | grep bla

If I can interject into this conversation; the shebang indicates which executable will actually run the file and the executable bit flags that the user (or a user) has granted permission to run the file (probably why auto-executing of shell scripts in the Windows %PATH% was objected to before, since there's no executable bit there).

Perhaps a { shell: "shellname" } option would be better? This would indicate which program you want to pass the file to for execution (being explicit that you do want to execute a script rather than an application), and it could still be useful for Unices.

Basically, the shebang line is invalid syntax in Javascript, but Node.js tolerates it because Unix shells will automatically interpret that line and pass the file to the specified program -- but then that means you can't use that JS file in a browser without modification. A developer using browser-require might instead prefer to spawn the script with { shell: "node" } instead and not include the shebang at all.

EDIT: Of course, in Unix, a shebanged shell script could still be run the normal way.

@piscisaureus but you can write batch/perl/etc. scripts to allow spawn('scp') or spawn('readlink') to behave in the same way on Windows as on *nix.

I thought the whole point of adding libuv was to give portable cross-platform support.

@piscisaureus Is this still an issue?

I think @piscisaureus's suggestion is on to something.

It's kind of annoying right now that there's a "run in a shell" function that buffers the output (exec), and a "run as-is" option that doesn't (spawn), and a "run as-is" that buffers the output (execFile), but no "run in a shell" that doesn't buffer the output.

However, it can't be spawn(cmd, args, {shell: true}), because that doesn't really work for the ls -laF | grep foo case, right? The cmd is either "sh" or "cmd" depending on platform, and the arg is always either /c $cmd or -c $cmd.

What about this?

child = child_process.spawnShell('util.bat glerp gorp', {options...})

Which would be sugar for:

child = child_process.spawn(isWin ? 'cmd' : 'sh', [isWin?'/c':'-c', arg], options)
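Spelled out a little further, the proposed sugar might look like the sketch below. `shellInvocation` and `spawnShell` are hypothetical names (no such function exists in child_process), and the Windows branch borrows the /s /c plus outer-quotes form from the workaround earlier in the thread.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: work out which shell to run and how to hand it the command.
function shellInvocation(command, platform) {
  if (platform === 'win32') {
    return {
      file: 'cmd.exe',
      // /s /c plus one outer pair of quotes, as in the earlier workaround.
      args: ['/s', '/c', '"' + command + '"'],
      options: { windowsVerbatimArguments: true },
    };
  }
  return { file: '/bin/sh', args: ['-c', command], options: {} };
}

// Sketch of the proposed spawnShell(): same streams as spawn(), no buffering.
function spawnShell(command, options) {
  const inv = shellInvocation(command, process.platform);
  return spawn(inv.file, inv.args, Object.assign({}, options, inv.options));
}
```

A caller would then write something like `spawnShell('ls -laF | grep foo').stdout.pipe(process.stdout)` and get streamed output instead of the buffered result of exec.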

So, I don't understand anything about shells on Unix, so take this with a grain of salt. In particular I don't understand how sh -c would affect things. I guess maybe that would let it run chmodded shebanged files?

But for me the problem is that there is lots of published code out there that is not cross-platform because of this. (npm with git and npm-www with couchdb recently, but much more I've seen) A fix that retroactively makes all that code work would be ideal, instead of throwing a new method in the already-confusing medley of child_process and having to evangelize "use this if you want your code to work on Windows."

So I'd say

We could make libuv prefix cmd /c if the file is not an exe - but iirc Peter was strongly against that.

is the most useful thing I've seen so far. Alternately, if there's an alternative to the CreateProcess API that doesn't have this suckiness, that would be nice.

@domenic sh -c would allow you to do shell syntax stuff, pipes and whatnot. sh -c "ls -laF | grep foo > output.log" would list all the files, search the output lines for "foo" and then write the results to output.log. "ls -laF | grep foo > output.log" isn't an executable name, it's a shell program. (An executable name is also a valid shell program, of course.)

If we start wrapping every command in a shell, then that's not so great.

@isaacs I see. Then that seems like a fairly orthogonal concern to the fact that Windows executables come in many flavors (exe, bat, cmd, etc.). People using spawn will not expect shell syntax, but they will expect that spawn("couchdb") works cross-platform without any extra { shell: true } or spawnShell or the like.

It sounds like spawnShell would be independently useful for the

"run in a shell" that doesn't buffer the output

case, but regardless I think spawn should work with .bats, .cmds, etc.

libuv used to use PATHEXT, but this was reverted in joyent/libuv@8ed2ffb because it didn't work for non-exe files. I would be ok with putting this back in and running all non-exe files with "cmd /c". I take patches. (Note that the escaping rules are quite complicated when running stuff with cmd /c)
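As a rough illustration of "run all non-exe files with cmd /c", the decision could be sketched like this. `planLaunch` is a hypothetical name, the choice of .exe and .com as the directly launchable extensions is an assumption, and the escaping problem mentioned above is deliberately left out.

```javascript
const path = require('path');

// Hypothetical helper: decide how to launch an already-resolved Windows path.
// Assumption: only .exe and .com are started directly; the rest go via cmd /c.
function planLaunch(resolvedFile, args) {
  const ext = path.win32.extname(resolvedFile).toLowerCase();
  if (ext === '.exe' || ext === '.com') {
    return { file: resolvedFile, args: args };
  }
  // NOTE: no cmd.exe escaping is applied here; a real patch would need it.
  return { file: 'cmd.exe', args: ['/c', resolvedFile].concat(args) };
}

console.log(planLaunch('C:\\util\\mycmd.bat', ['glerp']));
console.log(planLaunch('C:\\util\\tool.exe', ['glerp']));
```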

@piscisaureus Would using ShellExecuteEx instead of CreateProcess be a good route toward this?

Otherwise, if I were to work on a patch for this, would you suggest it as a libuv patch or Node patch? Probably libuv, right?

@domenic Afaik ShellExecuteEx doesn't allow redirection of stdio handles.

The problem seems to be that child_process got a bit muddied up trying to handle some of these cases for shell commands, but we've reached a point where it's getting uncomfortable to continue to add shims in there to support more shell-friendly options. I don't think this is a cross-platform problem. It's a problem of executables vs. shell scripts. This isn't a problem for most Unix devs, because they already know the difference between the two.

I'm primarily a Windows guy myself, but it doesn't seem fair that Windows should get special treatment at the child_process level just because cmd and bat files are treated as shell commands by the OS. Windows devs will just need to learn the difference between an executable and a shell command for their platform. (This isn't just a problem with Node, BTW. I've seen this play out in other platforms, too.) We need to respect Windows's idea of what is executable, and what is not, and be careful not to break that.

I initially liked the idea of adding a {shell:true} sort of option to spawn(), but after looking at the code, I'd expect it to be supported in exec() and execFile(), too, and that would break things. (We could have different defaults for spawn() and exec(), but that's still pretty confusing.)

Rather than try to make the mess of spawn(), exec(), and execFile() do even more black magic, and risk making things even more unclear, I'd prefer that we keep things honest and add @isaacs's suggestion of adding a spawnShell() method. If we clean up the docs a bit, and make it more clear what each of these methods does, we've got a shot at having a fairly complete set of use cases.

In the long term, it might be worth considering the deprecation of exec() and execFile(), in favor of a set of functions with a more deliberate separation between executables and shell commands. The code would get a lot cleaner, and with just slightly better documentation and error messages, it would also be an easier API for new developers to understand.

I'd like to add my 2 cents with regard to using

require('child_process').spawn('cmd', ['/s', '/c', '"C:\\util\\mycmd.bat"']);

on windows. This is giving me a real headache as doing this with long running processes results in those processes being orphaned when I call child.kill() ie. the cmd process is killed but the process it spawns is not.

See this gist for an example:

https://gist.github.com/4117163

I'm still trying to figure out if this is a bug that should be raised separately (child.kill seems to deliberately not kill grandchildren (which is annoying also) and this might be considered a grandchild)
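One userland way around the orphaned process is to terminate the whole tree rather than just the cmd wrapper. The sketch below is an assumption-laden illustration: `killTreeCommand` and `killTree` are hypothetical names, the Windows branch shells out to the taskkill utility, and the POSIX branch only works if the child was spawned with `{ detached: true }`.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: build the command that terminates `pid` and its descendants.
function killTreeCommand(pid, platform) {
  if (platform === 'win32') {
    // taskkill /T ends the whole process tree; /F forces termination.
    return { file: 'taskkill', args: ['/pid', String(pid), '/T', '/F'] };
  }
  return null; // on POSIX, signal the process group instead (see killTree)
}

function killTree(child) {
  const cmd = killTreeCommand(child.pid, process.platform);
  if (cmd) {
    spawn(cmd.file, cmd.args, { stdio: 'ignore' });
  } else {
    // Assumes the child was spawned with { detached: true }, making it a group leader.
    process.kill(-child.pid, 'SIGTERM');
  }
}
```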

@piscisaureus Do you have any news with this issue? I've noticed that you said 'Yes, I am actively working on that.', in this comment a year ago.

I've dug into this issue a bit, and here is all the information I gathered:

  1. require('child_process').exec does the following:

      if (process.platform === 'win32') {
        file = 'cmd.exe';
        args = ['/s', '/c', '"' + command + '"'];
        // Make a shallow copy before patching so we don't clobber the user's
        // options object.
        options = util._extend({}, options);
        options.windowsVerbatimArguments = true;
      } else {
        file = '/bin/sh';
        args = ['-c', command];
      }
    
  2. There is no argv on Windows - the OS passes the full command line to the application, and it's up to the application (or its runtime library)
    to process this string in any way it wants. That means that, in theory, different processes can have different escaping rules.
    But there are formal rules from Microsoft that the Visual C++ runtime uses.
    As almost everything is compiled against this runtime (MinGW GCC uses it too), I think we can assume that each executable follows these rules.
    That's how libuv escapes the arguments of spawn if windowsVerbatimArguments == false. Otherwise, it just joins all arguments with spaces.

  3. In cmd.exe /s /c the second argument (/c) is mandatory. It means 'execute the following command' and exit.

  4. /s is not mandatory, but it's required in a very specific case.
    TL;DR: suppose you have both c:\a.exe and c:\a b\c.exe. If you run cmd /c "c:\a b\c.exe", cmd will not strip the quotes, but if you add /s it will strip them and run c:\a.exe with b\c.exe as an argument.

  5. cmd.exe has its own rules for escaping - ^ is used as escape symbol. For example, if I run cmd.exe /c "echo 1^^2", it will print 1^2. &<>()@^| have
    special meaning too and should be escaped if we don't want them to be interpreted by cmd.exe.

  6. So, if you run an executable using cmd.exe, you should escape your arguments twice, in order: first for your runtime library and then for cmd.exe. But, as the former escapes
    space characters, quotation marks, and backslashes, while the latter escapes a different set of characters, we are free to do the escaping in reverse order (escape for cmd.exe first,
    and then for the VCRT).

  7. spawn('a.bat') works perfectly - it spawns cmd.exe even though cmd.exe is not explicitly specified. This is the behavior of CreateProcess - if you specify a.bat in both lpApplicationName and lpCommandLine,
    it will run the command interpreter. Unfortunately, I haven't found a place where this behavior is documented. It doesn't work with other scripts, though - no .vbs, no .js. But it still works with .cmd files.

  8. cmd.exe does something weird with ^ in batch files. It looks like parameters are unescaped several times:

    1. When you run cmd.exe /s /c
    2. When some command is run from inside the batch file

    As a consequence, if you have, say, two nested batches (x.bat calls y.bat, which calls z.exe), then you need to run cmd.exe /s /c "x.bat ^^^^^^^^" (yes, eight) to get z.exe ^ started:
    cmd.exe /s /c cuts half of them, then another half is cut while expanding arguments for call y.bat, and the new cmd.exe cuts another half. My gist.
    I was unable to pass an odd number of ^ to z.exe through two batch files.

  9. But cmd.exe behaves differently if ^ is inside quotation marks. I was unable to understand the logic behind this.

  10. Summarizing: if you have ^ in your arguments, you're gonna have a bad time, because exec always runs cmd.exe and does not escape ^. spawn, in contrast, may or may not
    use cmd.exe and thus may or may not require escaping of ^. But if you don't use ^, you can always use one of two options:

    1. spawn('a.bat', [/*...*/]) and everything will be escaped properly to work as an executable's parameter. You shouldn't use the parameters inside the batch file itself (for example, trying to calculate a parameter's length
      using built-in cmd.exe commands), because cmd.exe knows nothing about backslashes. I don't know of cases where you use parameters for anything other than running external commands, because both type and echo are executables.
      Only some kind of for, maybe. I think it's one of the best workarounds for now, even though I don't know why it works (I've tested on Windows XP, Windows Server 2003 and Windows 7)
    2. spawn('cmd', ['/s', '/c', '"a.bat ' + args + '"'], {windowsVerbatimArguments: true}), if you carefully escape args for use with the VCRT yourself.
  11. For example, in bower/bower#626 spawn(which.sync(command), [/*...*/]) was considered as a workaround.

  12. Unfortunately, there is no universal escaping solution.
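The runtime-library rules from point 2 can be sketched as a quoting function. This follows the commonly described Visual C++ runtime convention (backslashes are only special when they precede a double quote); `quoteForMsvcrt` is a hypothetical name, and the code is an illustration rather than libuv's actual implementation.

```javascript
// Hypothetical helper: quote one argument so that a program using the
// Visual C++ runtime's command-line parser receives it unchanged.
function quoteForMsvcrt(arg) {
  // Arguments without whitespace or quotes can be passed as they are.
  if (arg !== '' && !/[\s"]/.test(arg)) return arg;
  let out = '"';
  let backslashes = 0;
  for (const ch of arg) {
    if (ch === '\\') {
      backslashes++; // defer: their meaning depends on what follows
    } else if (ch === '"') {
      // Double the pending backslashes, then escape the quote itself.
      out += '\\'.repeat(backslashes * 2 + 1) + '"';
      backslashes = 0;
    } else {
      // Backslashes not followed by a quote are literal.
      out += '\\'.repeat(backslashes) + ch;
      backslashes = 0;
    }
  }
  // Trailing backslashes precede the closing quote, so double them.
  return out + '\\'.repeat(backslashes * 2) + '"';
}

console.log(quoteForMsvcrt('Hello World'));
console.log(quoteForMsvcrt('say "hi"'));
```

Note that this covers only the first of the two escaping layers from point 6; nothing here protects against cmd.exe metacharacters such as ^ or &.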

I know two modules that partially solve this issue:

  1. child-proc. It uses windowsVerbatimArguments (which is undocumented) and does not perform escaping. This can
    lead to differences in behavior from child_process. Say, the following code:

    var spawn = require('child-proc').spawn;
    spawn('bats\\a.bat b\\b.bat').stdout.pipe(process.stdout);
    

    runs a.bat, but if you replace child-proc with child_process, b.bat will be started.

  2. spawn-cmd. It does not concatenate arguments itself and does not use /S. However,
    this is not a problem, because cmd.exe does not remove quotes if the first character is not a quote, and file names rarely start with a quote (UPD: except when they have spaces in the path; in that case the file name is quoted by libuv and we have a bad time).
    The only problem with it is that the presence of process.env.comspec is used instead of os.platform() to detect Windows (UPD: fixed in 0.0.2)

win-spawn was my attempt at this, although reading your last post (would probably be worth making into a blog post somewhere) I realise there would be a lot more work to do, it was just a quick hack to solve the problem I was having at the time.

Thanks for this thread it was a real time saver. @ForbesLindesay win-spawn worked like a charm, thanks.

Wow, I've found a real-life case when /S and quoting the whole command may be considered interesting: cmd /C "C:\Program Files (x86)\Git\cmd\git.EXE" clone "hello world" - in this example, cmd cuts first and last quotes, leaving us with 'C:\Program' is not recognized as ...

I've described this in featurist/spawn-cmd#3
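The git example can be reproduced with a tiny model of the behaviour described above, where cmd cuts the first and the last quote of whatever follows /C. `stripOuterQuotes` is a hypothetical name and this is only a simulation of that one rule, not cmd.exe's real parser.

```javascript
// Model of the described rule: if the text starts with a quote, drop that
// quote and the last quote on the line, keeping everything else.
function stripOuterQuotes(afterSlashC) {
  if (!afterSlashC.startsWith('"')) return afterSlashC;
  const last = afterSlashC.lastIndexOf('"');
  if (last === 0) return afterSlashC.slice(1);
  return afterSlashC.slice(1, last) + afterSlashC.slice(last + 1);
}

const bare = '"C:\\Program Files (x86)\\Git\\cmd\\git.EXE" clone "hello world"';

// Passed as-is, the quotes around the program path and the last argument are lost.
console.log(stripOuterQuotes(bare));

// Wrapped in one extra pair of quotes, the original command line survives.
console.log(stripOuterQuotes('"' + bare + '"'));
```

Under this model, wrapping the whole command in an extra pair of quotes is what keeps the inner quoting intact, which is why the workaround combines /S with a quoted command string.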

Still experiencing problems with this. What will we do about it?

I've made a module out of the very well written (I think) superspawn.js by Andrew Grieve which is part of Apache Cordova cli/coho. It works great so far. I think it's a very good starting point for a userland solution to this problem. Plus it uses promises. Please feel free to try it and to contribute.
https://github.com/MarcDiethelm/superspawn

It'd probably be the simplest solution to change the CreateProcess call to something
like this pseudocode:

var com = getenv("COMSPEC");
var args = "/C \"" + to_be_executed_commandline + "\"";
CreateProcessW(com, args)

This should execute to_be_executed_commandline as plain command.
As, according to cmd /help the normal behavior is to execute everything
between first and last quote:

    2.  Otherwise, old behavior is to see if the first character is
        a quote character and if so, strip the leading character and
        remove the last quote character on the command line, preserving
        any text after the last quote character.

This could as well make much of the pre-processing in process.c obsolete.

@WernerWenz in that case you should care about escaping arguments to the command by yourself. If I call spawn('rm', ['-rf', 'Hello World']) I expect that folder Hello World will be deleted. Thus, node has to escape space in the argument and call something like rm with arguments -rf "Hello World"

So, that's absolutely not the case. If you don't need arguments escaping, you can use child_process.exec, which works with PATHEXT correctly.

@yeputons unless I'm missing anything, it should work perfectly, as long as you construct to_be_executed_commandline
the following way:

var to_be_executed_commandline = "\"" + escape(command) + "\"";
foreach (var arg in args) {
    // Separate arguments with a space so they don't run together.
    to_be_executed_commandline += " \"" + escape(arg) + "\"";
}

This should effectively do the very same as typing

"command" "arg0" "arg1" ... "argn"

in a cmd shell.

As escape() you'd probably want to escape " as well as @<>|& (and possibly other control sequences I'm currently not aware of).

process.c already seems to have the required logic to at least quote args via quote_cmd_arg.

Not escaping control chars would make

spawn('echo', ['Hello World', '>test.txt'])

print "Hello World" to a file, while escaping would print "Hello World >test.txt" to stdout.

The brute force method for escaping would be appending ^ before each character.

http://qntm.org/cmd shows a sample for escaping only the control characters.

http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/everyone-quotes-arguments-the-wrong-way.aspx gives quite some more in depth information about how cmd handles args.
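A minimal sketch of both escaping strategies mentioned above, the targeted one and the brute-force one. The function names are hypothetical, and the metacharacter set is an assumption assembled from the characters named in this thread; it is not claimed to be complete.

```javascript
// Characters treated as cmd.exe metacharacters in this sketch. The set is an
// assumption taken from the thread; it is not claimed to be complete.
const CMD_META = /[&<>()@^|"]/g;

// Targeted variant: put ^ in front of each metacharacter only.
function escapeForCmd(text) {
  return text.replace(CMD_META, '^$&');
}

// Brute-force variant: put ^ in front of every character.
function escapeEveryChar(text) {
  return Array.from(text, (ch) => '^' + ch).join('');
}

console.log(escapeForCmd('>test.txt'));
console.log(escapeEveryChar('a|b'));
```

With the targeted variant, `>test.txt` becomes `^>test.txt`, so cmd would treat it as ordinary text rather than as a redirection.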

@WernerWenz yep, that's it, as long as you deal with escaping manually. But why use spawn then and not exec? My point is that on Linux spawn works perfectly with any arguments, which are passed to applications as is (i.e. calling rm -rf "Test folder" does not require any quotes), because arguments are passed directly to applications without command line escaping/parsing, which is not the case on Windows. I think we would like the same behavior on both OSes, and spawn('echo', ['Hello World', '>test.txt']) should print Hello World >test.txt, because stream redirection is done by the shell, not by the application itself.

The problem I have with using exec instead of spawn is that I don't use node.js directly myself.
I'm actually having trouble with a grunt task that happens to be a .bat rather than an .exe

Trying to work around the issues in the task runner is likely less straightforward than getting a proper and more consistent solution for spawn on windows.

I'm of the same opinion regarding the echo example.

I'm however not sure how to deal with UV_PROCESS_WINDOWS_VERBATIM_ARGUMENTS in this context. It seems to override the escaping process.

It's also important to note, that when launching through cmd, that

spawn('hello.exe', ['^X'])

will require proper escaping. Otherwise cmd will invoke hello with X as arg rather than pass ^X.
Same is true for any other control sequences, as hello.exe shall be executed with the arguments as they are supplied.

UV_PROCESS_WINDOWS_VERBATIM_ARGUMENTS would most likely become deprecated/useless if I redirected process creation through cmd.

Maybe it's just me, but this doesn't seem like a single issue, but 5 separate ones, all related to the way windows works:

  1. spawn does not use PATHEXT
    #2318 (comment)
  2. spawn causes weird escape issues due to the way windows executes commands in a shell
    #2318 (comment)
  3. spawn doesn't execute non-executables in the windows shell (and this may not be desirable)
    #2318 (comment)
  4. spawn can potentially orphan grandchildren
    #2318 (comment)
  5. some node users want additional features, specifically related to the difference between spawn and exec
    #2318 (comment)

I know it's all related, but can we split this into smaller, more manageable chunks?

@deltreey

  2. It's not an issue with spawn or exec at all. I was just exploring different options for fixing the main one and found out that what exec does is very important in some cases, so I noted this case in my comment

  3. Btw, it's closely related to PATHEXT. There are just two options for resolving this point and no. 1 - we either pass each call to cmd.exe or manually process PATHEXT and call CreateProcess, which, by coincidence, is able to run batch files.

4-5) Agree

I just can't believe this is still open.
What can we do to get it resolved?

@yeputons With regard to 3: am I reading that correctly? Are you saying that use of PATHEXT (automatically via cmd.exe) and opening batch files are mutually exclusive? I just tested it and had no issues running cmd.exe /c test.bat

Also, since you're still active on this thread, I wanted to mention I disagree with your second item from #2318 (comment)

I don't like the idea of assuming people are following some standards...no one ever follows the rules.

and with regard to your number 6, I don't understand why we want to do this escaping in reverse order. If your claim is that we're only "sometimes" running cmd.exe, then I would ask in what cases it would be a bad idea to run cmd.exe. Either way, I think the whole escaping of arguments issue really is a separate problem because, as you explained, windows executables can choose to read the arguments in any way they choose. In one program, arguments might be done with - or -- (requiring an escape character to use this elsewhere) and in another / or \, so escaping the arguments for the executable might be best left to the caller; node can then assume incoming arguments are correctly escaped from the caller and only worry about the weird ^ escapes for cmd.exe

@Anachron Submit a pull request...

@deltreey

  3. No, but "passing commands to cmd.exe" is a superset of "using PATHEXT and CreateProcess". The former allows you to do more and it corresponds with exec's behavior; that's why I think that separating these issues is no good.

Unfortunately, it's true - people do not follow standards. But we have to follow some standard of escaping or, at least, invent our own. Otherwise spawn would be of no use at all. I would just like to follow an 'official' (in some way) standard rather than reinventing the wheel.

Yes, it's a separate issue, I agree.

Based on @WernerWenz comment I've made cross-spawn that uses cmd /s /c along with the escaping strategy mentioned in http://qntm.org/cmd.

A somewhat extensive set of tests are all passing on windows.

It's been years and this is still a bug. Has it been fixed in io.js, btw?

Just reiterating that this issue is still well and alive.

hi, any news?

@orangemocha ... would you have an opportunity to look at this one?

Yes, it's on my queue and I am aware of its importance.

Ok. Just checking in :)

Is this ever going to be fixed in v0.12.x or just in the latest version?

@waynebloss ... unclear. hopefully this one will get done for v0.12 but it's going to depend entirely on @orangemocha's availability to take a look... unless someone else is able to step up and help resolve it.

Any update on this bugfix? Just ran into it today and am going to have to do some hacky things to get this to work with Windows.

In my case, I tried to package what is basically the boilerplate package, and whenever I removed the --icon-icon.ico flag, I stopped getting this error. Not sure if that is of any value to debugging this issue.

Is this fixed by nodejs/node#4598 ?

+1. I have to do this to run Windows binaries that do not end in .exe:

var os = require('os');
// Compare exactly: the unanchored /win/ would also match 'darwin'.
if (os.platform() === 'win32') {
  command += ".cmd"
}
spawn(command, [ ... ])

Does this still occur with current versions of Node.js? I don't think there's an open issue for this in the active repository at https://github.com/nodejs/node/issues. So either this has been fixed, or you may want to report it there; this repo is no longer in use.

timdp commented

@tjwebb That won't work with .bat or anything else in PATHEXT. The most reliable solution is to use cross-spawn.

+1 IMO this is an issue. If you can install something globally with npm then you should be able to execute it using spawn() in any OS. In my case I have to execute 'node node_modules/gulp/bin/gulp' instead of just 'gulp' - this is ugly in my integration tests that need to be run in windows....

This issue was resolved elegantly by adriaanthomas/npm-pkgr@4663185, @adriaanthomas 👍

e.g. npm, just replace

spawn('npm', ...

by

spawn(/^win/.test(process.platform) ? 'npm.cmd' : 'npm', ...

@flyskywhy That's not actually a "fix", it is an ugly workaround in userland code that should be completely unnecessary if this Node.js issue was actually fixed.

In any case, as several people have explained, this issue is due to OS discrepancies in how they handle shell commands and executable files. I believe the closest thing to a real fix is the recently introduced child_process.spawn()'s shell option, as documented here. If this issue tracker is still maintained, I believe this issue can be closed.

@UltCombo, OK, in short 😄:

e.g. replace spawn('npm', ['-v'], {stdio: 'inherit'}) with:

  • for all Node.js versions:

    spawn(/^win/.test(process.platform) ? 'npm.cmd' : 'npm', ['-v'], {stdio: 'inherit'})
    
  • for node.js 5.x and later:

    spawn('npm', ['-v'], {stdio: 'inherit', shell: true})
    

👍

By the way, I believe you can test process.platform === 'win32' without a regexp; the 64-bit versions of Windows are still part of the win32 platform, "64-bit" is just an extension of win32.
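A quick comparison of the platform checks that have appeared in this thread. The platform list is a small illustrative subset, and `npmCommand` is a hypothetical helper wrapping the npm.cmd workaround.

```javascript
// Values process.platform can report (a subset, for illustration).
const platforms = ['win32', 'darwin', 'linux'];

const looseMatches = platforms.filter((p) => /win/.test(p));
const anchoredMatches = platforms.filter((p) => /^win/.test(p));
const exactMatches = platforms.filter((p) => p === 'win32');

console.log(looseMatches);    // [ 'win32', 'darwin' ]
console.log(anchoredMatches); // [ 'win32' ]
console.log(exactMatches);    // [ 'win32' ]

// Hypothetical helper combining the exact test with the npm.cmd workaround.
function npmCommand(platform) {
  return platform === 'win32' ? 'npm.cmd' : 'npm';
}
```

The unanchored pattern also matches 'darwin', so on macOS it would wrongly append a Windows extension; the anchored regexp and the exact comparison agree with each other.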

@UltCombo, I just learned this regexp from facebook/react-native@811079e, which is also trying to solve the same spawn('npm'... ENOENT problem 😜

bzoz commented

Is this still an issue? Does adding {shell:true} fix it?

@bzoz I expressed this as 5 separate issues here #2318 (comment) so we might use those as tests cases if we wanted to know if {shell: true} fixes it.

bzoz commented

I did testing with current node, and:

I think we should close this issue and maybe have new ones opened for the remaining two items (grandchildren and additional features)

bzoz commented

I'm closing this issue. You can always reopen it if needed.

@bzoz when using shell:true you must escape everything manually, nodejs just joins the args and wraps them with quotes.

You should correct your comment because it misleads people into thinking escaping is done inside which isn't true.
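The hazard can be illustrated with a rough model of the joining step this comment describes. `joinForShell` is a hypothetical name that mirrors the description in the thread rather than Node's source.

```javascript
// Rough model of what the comment describes: with { shell: true } the command
// and its arguments are joined with spaces into one string for the shell.
function joinForShell(command, args) {
  return [command].concat(args).join(' ');
}

const line = joinForShell('echo', ['Hello World', '>test.txt']);
console.log(line);
// The shell then sees a redirection to test.txt, not a literal argument.
```

Because the argument boundaries are gone after the join, any quoting or escaping has to be applied by the caller before the arguments are handed to spawn.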