Small package alternative to Execa
Closed this issue ยท 9 comments
Would it make sense for us to create a trimmed down version of Execa? As a separate repository (owned by you like Execa, if you want). With a focus on small package size instead of features. Under 100 lines of code.
Some of the "small package size" alternatives to Execa are not mentioning the key features that users are losing when opting for a smaller alternative (such as security, cross-platform support, debugging, etc.). By creating our own smaller alternative ("yoctoexeca"?), we could better warn users about those trade-offs, so they make an informed decision.
When quickly browsing the code of those smaller alternatives, I could also spot several bugs and issues that are known to us and fixed by Execa. But users might not spot those until they hit those bugs.
This situation is somewhat akin to what happened with Chalk, which led to the creation of Yoctocolors. This would be basically the same idea, but applied to Execa.
What are your thoughts on this @sindresorhus?
I already started work on this the other day, but didn't manage to finish yet because of travels.
This is amazing! ๐ ๐ ๐
If you need any help, please let me know.
Some potential things that could be done to keep things small:
- Re-using
util.promisify(child_process.execFile)
. It's not as good as the logic we have in Execa, but it's very short. - Trying to re-use/forward the options, subprocess and result of Node.js
child_process
, and just augment them.
Some features that are useful and can be implemented in not too much code:
- The
$
script interface - The template string format
- Local binaries
- Stripping the final newline
reject
optioninput
andinputFile
optionslines
option
Some features that might be left out, from a code size vs usefulness standpoint (unless they can be implemented in just a few lines)?
execaNode()
- Graceful termination
- Verbose mode
- Custom logging
- Transforms
- Interleaved output
- IPC
- Options binding
parseCommandString()
- All the
stdin
/stdout
/stderr
additional features (multiple targets, files, web streams) result.duration
.iterable()
,.readable()
,.writable()
,.duplex()
subprocess.kill(error)
result.escapedCommand
error.message
additions
The following features are harder, as they are quite useful, but harder to implement in less code:
maxBuffer
option- Line iteration
- Piping
cleanup
option- Synchronous execution
- Full Windows support
First iteration: https://github.com/sindresorhus/nano-spawn
It's wildly incomplete. Need to figure out some details before we continue.
Constraints:
- Only text (binary is an edge-case and people can use
execa
for that) - Only lines
- No Node.js streams (only iterables)
- No full Windows support (too much code and workarounds, should be fixed in Node.js itself instead)
Open question:
Should the spawn()
method itself be a promise or should it have const result = await subprocess.exited;
?
The former is obviously nicer, but I think I remember some issues with it doing that in execa
?
My thoughts. Please do argue if you disagree with something. You have more experience with subprocesses than me. I value your take on this.
Some features that are useful and can be implemented in not too much code:
- The
$
script interface
Same as below. This package is for reusable packages, and we should make that clear in the readme.
- The template string format
Nah. This package is mostly for reusable packages where using the array syntax doesn't matter. For scripts, execa
is the best choice.
- Local binaries
Maybe?
- Stripping the final newline
Already done because the output are just lines.
reject
option
No. I feel this was a mistake in execa
. It complicated the code and types for little gain.
input
andinputFile
options
Yes, but I would like to simplify it. Would be great to end up with a single option. Maybe one that accepts an iterable or async iterable.
lines
option
It's the default without opt-out. Not sure we need an option?
Some features that might be left out, from a code size vs usefulness standpoint (unless they can be implemented in just a few lines)?
execaNode()
Maybe? I think don't it's super important, but could be a nice to have. Let's see how many LOCs the other things take first.
- Graceful termination
Nah. Would require too much code.
- Verbose mode
Yes. Could be useful for debugging. Maybe a simpler implementation than in execa
.
- Custom logging
Nah, can be done in transforms.
- Transforms
Yes. Should not be hard since we only deal with lines of text.
- Interleaved output
It's the default.
- IPC
Out of scope.
- Options binding
Not sure. This is mostly useful for scripts, which is not the target. Thoughts?
parseCommandString()
Nah
- All the
stdin
/stdout
/stderr
additional features (multiple targets, files, web streams)
It needs some kind of piping, but I would want it to be as simple as possible. Maybe based on async iterables only?
result.duration
This could be useful in verbose
option, so we add verbose mode, we could just add this too.
.iterable()
,.readable()
,.writable()
,.duplex()
Nah. You can do all of that from an async iterable, which is already available.
subprocess.kill(error)
Yes
result.escapedCommand
Could be useful for debugging, but probably requires too much code.
error.message
additions
This could be useful.
The following features are harder, as they are quite useful, but harder to implement in less code:
maxBuffer
option
I'm not even sure why Node.js has this. If you cannot trust the binary you are executing, you already have lost.
- Line iteration
Already the default.
- Piping
A simplified interface for this could be useful.
cleanup
option
No
- Synchronous execution
Anti-pattern. Not important. We can defer to execa
.
- Full Windows support
Would require too much code. Let's see how far we get with the other more important things.
Should the spawn() method itself be a promise or should it have const result = await subprocess.exited;?
The former is obviously nicer, but I think I remember some issues with it doing that in execa?
Yes, being a Promise<Result>
is probably nicer. That's what most users seem to want. The main problem Execa has is merging the subprocess with that promise. However, we could expose it as a property instead. (see #413)
Making that return value iterable also sounds like a good idea. ๐
Only text (binary is an edge-case and people can use execa for that)
๐
Only lines
๐
No Node.js streams (only iterables)
๐
Graceful termination
Nah. Would require too much code.
๐
Custom logging
Nah, can be done in transforms.
๐
IPC
Out of scope.
๐
parseCommandString()
Nah
๐
.iterable(), .readable(), .writable(), .duplex()
Nah. You can do all of that from an async iterable, which is already available.
๐
result.escapedCommand
Could be useful for debugging, but probably requires too much code.
๐
maxBuffer option
I'm not even sure why Node.js has this. If you cannot trust the binary you are executing, you already have lost.
๐
Synchronous execution
Anti-pattern. Not important. We can defer to execa.
๐
reject option
No. I feel this was a mistake in execa. It complicated the code and types for little gain.
๐
execaNode()
Maybe? I think don't it's super important, but could be a nice to have. Let's see how many LOCs the other things take first.
๐
cleanup option
No
๐
The $ script interface
Same as below. This package is for reusable packages, and we should make that clear in the readme.
The template string format
Nah. This package is mostly for reusable packages where using the array syntax doesn't matter. For scripts, execa is the best choice.
Yes, that's a good take: only for typical libraries usage, as opposed to scripts, where Execa would be better instead.
Stripping the final newline
Already done because the output are just lines.
lines option
It's the default without opt-out. Not sure we need an option?
Line iteration
Already the default.
Nice idea. Just making everything line-wise.
Options binding
Not sure. This is mostly useful for scripts, which is not the target. Thoughts?
Yes, that's something that can easily be done by users themselves.
subprocess.kill(error)
Yes
That should actually be fairly small to implement. On the other hand, is it useful enough? The main difference with subprocess.kill()
is it sets the error
that's being thrown. This is something that can be emulated by users with a try
/catch
block.
error.message additions
This could be useful.
Maybe we can do a shortcut implementation. Execa's logic there is actually fairly big.
No full Windows support (too much code and workarounds, should be fixed in Node.js itself instead)
Full Windows support
Would require too much code. Let's see how far we get with the other more important things.
Yeah, that's a tough one, due to the amount of code that would be required for Windows. And that's a good point that Node.js should just fix some of those issues (even though they have notably not do so for many many years now).
We should have a clear understanding (and automated tests) of what is not working on Windows, so we're aware of the limitations.
input and inputFile options
Yes, but I would like to simplify it. Would be great to end up with a single option. Maybe one that accepts an iterable or async iterable.
Having a single option makes sense.
I wonder whether iterables would be needed for inputs (as opposed to outputs). For a subprocess, those would most likely be producing string chunks. For sync iterables, users can easily turn it into a single string using [...iterables].join('')
. For async iterables, Array.fromAsync()
can be used.
They would be useful mostly for incremental/progressive input (programmatic, not coming from an interactive terminal). Technically, users could still do something similar by calling subprocess.stdin.write()
, even though we'd want users to avoid streams. But overall, I wonder: how often do users need incremental programmatic inputs for a subprocess?
It seems to me the main types of input that users probably need are: none, string, file, inherit.
Verbose mode
Yes. Could be useful for debugging. Maybe a simpler implementation than in execa.
result.duration
This could be useful in verbose option, so we add verbose mode, we could just add this too.
The main parts are printing: the command line, the output, any error, the success/failure status, the duration.
This is very useful indeed. I am wondering how the code for it can be reduced. For example, printing the output might need to be skipped, since it's difficult to do that without changing the code's behavior. ๐ค
Also, users can implement most of it themselves by doing some console.log()
before/after the subprocess call. I wonder whether this might be a nice-to-have feature, best to leave for Execa instead?
Transforms
Yes. Should not be hard since we only deal with lines of text.
Wouldn't having access to the output as an iterable be enough? Users can then "map"/"reduce" over it. It's also fairly intuitive.
const output = [];
for await (const line of ...) {
output.push(mapLine(line));
}
Interleaved output
It's the default.
In most cases, stdout
carries the useful result, and stderr
is mostly empty except for error messages or things like progress lines. Would having the two mixed by default cause some issues? For example, if the subprocess outputs JSON on stdout
(but not stderr
) that the user intends to parse.
Based on this, it seems like it might important to still keep those separate?
That being said, it is true that merging both would simplify the API, because we could define only [Symbol.iterator]
instead of two explicit methods. Also, maybe the above comment is more relevant for the full result (result.stdout
/result.stderr
) that for the incremental/progressive result.
I.e. maybe stdout
/stderr
can be separate in the final result, but not when iterating?
One gotcha is that it requires merging iterables, which might increase implementation size. Note: I could be wrong but I think the current implementation here makes each line of stdout
waits for each line of stderr
(or the other way around) due to using Promise.all()
?
All the stdin/stdout/stderr additional features (multiple targets, files, web streams)
It needs some kind of piping, but I would want it to be as simple as possible. Maybe based on async iterables only?
Piping
A simplified interface for this could be useful.
The simplest way to implement piping would be to just do subprocess.stdout.pipe(subprocess.stdin)
. Using iterables would probably add more code IMHO. There are a few issues that Execa solves (for example, how EPIPE
is handled), but maybe there's a way to shortcut those?
What do you think?
Note: after your response, I'll break it down into separate issues in the new repository, to make the conversation clearer.
Yes, being a Promise is probably nicer. That's what most users seem to want. The main problem Execa has is merging the subprocess with that promise. However, we could expose it as a property instead. (see #413)
๐
That should actually be fairly small to implement. On the other hand, is it useful enough? The main difference with subprocess.kill() is it sets the error that's being thrown. This is something that can be emulated by users with a try/catch block.
Yeah, we can leave it out for now.
Maybe we can do a shortcut implementation. Execa's logic there is actually fairly big.
I didn't realize it was a big implementation. Maybe not worth it then. It's not essential, just nice to have.
Node.js should just fix some of those issues (even though they have notably not do so for many many years now).
We should really open new issues to restart old conversations. It would especially be useful if we can say that Bun or Deno already does it. They seem more willing to fix things that are missing in Node.js that exists in Bun or Deno.
I wonder whether iterables would be needed for inputs (as opposed to outputs).
It's mostly about memory, but yes, I don't think there is a common need.
It seems to me the main types of input that users probably need are: none, string, file, inherit.
๐
Also, users can implement most of it themselves by doing some console.log() before/after the subprocess call. I wonder whether this might be a nice-to-have feature, best to leave for Execa instead?
We can defer it and see the size of the package when we have added the essential stuff.
Wouldn't having access to the output as an iterable be enough? Users can then "map"/"reduce" over it. It's also fairly intuitive.
Indeed. Didn't think about that.
Based on this, it seems like it might important to still keep those separate?
It also exposes .stdout
and .stderr
iterables. It's just that the main promise is iterable on both.
I could be wrong but I think the current implementation here makes each line of stdout waits for each line of stderr (or the other way around) due to using Promise.all()?
Oh yeah, that is problably wrong.
The simplest way to implement piping would be to just do subprocess.stdout.pipe(subprocess.stdin). Using iterables would probably add more code IMHO. There are a few issues that Execa solves (for example, how EPIPE is handled), but maybe there's a way to shortcut those?
Yeah, we can always fall back to that. But would be nice to just have a simple way to pipe. Maybe we could take some inspiration from https://bun.sh/docs/api/spawn
๐ to all of the above! ๐
I have added issues in nano-spawn
, so we can close this.