sindresorhus/execa

Small package alternative to Execa

Closed this issue ยท 9 comments

Would it make sense for us to create a trimmed down version of Execa? As a separate repository (owned by you like Execa, if you want). With a focus on small package size instead of features. Under 100 lines of code.

Some of the "small package size" alternatives to Execa are not mentioning the key features that users are losing when opting for a smaller alternative (such as security, cross-platform support, debugging, etc.). By creating our own smaller alternative ("yoctoexeca"?), we could better warn users about those trade-offs, so they make an informed decision.

When quickly browsing the code of those smaller alternatives, I could also spot several bugs and issues that are known to us and fixed by Execa. But users might not spot those until they hit those bugs.

This situation is somewhat akin to what happened with Chalk, which led to the creation of Yoctocolors. This would be basically the same idea, but applied to Execa.

What are your thoughts on this @sindresorhus?

I already started work on this the other day, but didn't manage to finish yet because of travels.

This is amazing! ๐ŸŽ‰ ๐ŸŽ‰ ๐ŸŽ‰

If you need any help, please let me know.

Some potential things that could be done to keep things small:

  • Re-using util.promisify(child_process.execFile). It's not as good as the logic we have in Execa, but it's very short.
  • Trying to re-use/forward the options, subprocess and result of Node.js child_process, and just augment them.

Some features that are useful and can be implemented in not too much code:

  • The $ script interface
  • The template string format
  • Local binaries
  • Stripping the final newline
  • reject option
  • input and inputFile options
  • lines option

Some features that might be left out, from a code size vs usefulness standpoint (unless they can be implemented in just a few lines)?

  • execaNode()
  • Graceful termination
  • Verbose mode
  • Custom logging
  • Transforms
  • Interleaved output
  • IPC
  • Options binding
  • parseCommandString()
  • All the stdin/stdout/stderr additional features (multiple targets, files, web streams)
  • result.duration
  • .iterable(), .readable(), .writable(), .duplex()
  • subprocess.kill(error)
  • result.escapedCommand
  • error.message additions

The following features are harder, as they are quite useful, but harder to implement in less code:

  • maxBuffer option
  • Line iteration
  • Piping
  • cleanup option
  • Synchronous execution
  • Full Windows support

First iteration: https://github.com/sindresorhus/nano-spawn

It's wildly incomplete. Need to figure out some details before we continue.


Constraints:

  • Only text (binary is an edge-case and people can use execa for that)
  • Only lines
  • No Node.js streams (only iterables)
  • No full Windows support (too much code and workarounds, should be fixed in Node.js itself instead)

Open question:

Should the spawn() method itself be a promise or should it have const result = await subprocess.exited;?

The former is obviously nicer, but I think I remember some issues with it doing that in execa?

My thoughts. Please do argue if you disagree with something. You have more experience with subprocesses than me. I value your take on this.


Some features that are useful and can be implemented in not too much code:

  • The $ script interface

Same as below. This package is for reusable packages, and we should make that clear in the readme.

  • The template string format

Nah. This package is mostly for reusable packages where using the array syntax doesn't matter. For scripts, execa is the best choice.

  • Local binaries

Maybe?

  • Stripping the final newline

Already done because the output are just lines.

  • reject option

No. I feel this was a mistake in execa. It complicated the code and types for little gain.

  • input and inputFile options

Yes, but I would like to simplify it. Would be great to end up with a single option. Maybe one that accepts an iterable or async iterable.

  • lines option

It's the default without opt-out. Not sure we need an option?

Some features that might be left out, from a code size vs usefulness standpoint (unless they can be implemented in just a few lines)?

  • execaNode()

Maybe? I think don't it's super important, but could be a nice to have. Let's see how many LOCs the other things take first.

  • Graceful termination

Nah. Would require too much code.

  • Verbose mode

Yes. Could be useful for debugging. Maybe a simpler implementation than in execa.

  • Custom logging

Nah, can be done in transforms.

  • Transforms

Yes. Should not be hard since we only deal with lines of text.

  • Interleaved output

It's the default.

  • IPC

Out of scope.

  • Options binding

Not sure. This is mostly useful for scripts, which is not the target. Thoughts?

  • parseCommandString()

Nah

  • All the stdin/stdout/stderr additional features (multiple targets, files, web streams)

It needs some kind of piping, but I would want it to be as simple as possible. Maybe based on async iterables only?

  • result.duration

This could be useful in verbose option, so we add verbose mode, we could just add this too.

  • .iterable(), .readable(), .writable(), .duplex()

Nah. You can do all of that from an async iterable, which is already available.

  • subprocess.kill(error)

Yes

  • result.escapedCommand

Could be useful for debugging, but probably requires too much code.

  • error.message additions

This could be useful.

The following features are harder, as they are quite useful, but harder to implement in less code:

  • maxBuffer option

I'm not even sure why Node.js has this. If you cannot trust the binary you are executing, you already have lost.

  • Line iteration

Already the default.

  • Piping

A simplified interface for this could be useful.

  • cleanup option

No

  • Synchronous execution

Anti-pattern. Not important. We can defer to execa.

  • Full Windows support

Would require too much code. Let's see how far we get with the other more important things.

Should the spawn() method itself be a promise or should it have const result = await subprocess.exited;?
The former is obviously nicer, but I think I remember some issues with it doing that in execa?

Yes, being a Promise<Result> is probably nicer. That's what most users seem to want. The main problem Execa has is merging the subprocess with that promise. However, we could expose it as a property instead. (see #413)

Making that return value iterable also sounds like a good idea. ๐Ÿ‘


Only text (binary is an edge-case and people can use execa for that)

๐Ÿ‘

Only lines

๐Ÿ‘

No Node.js streams (only iterables)

๐Ÿ‘

Graceful termination

Nah. Would require too much code.

๐Ÿ‘

Custom logging

Nah, can be done in transforms.

๐Ÿ‘

IPC

Out of scope.

๐Ÿ‘

parseCommandString()

Nah

๐Ÿ‘

.iterable(), .readable(), .writable(), .duplex()

Nah. You can do all of that from an async iterable, which is already available.

๐Ÿ‘

result.escapedCommand

Could be useful for debugging, but probably requires too much code.

๐Ÿ‘

maxBuffer option

I'm not even sure why Node.js has this. If you cannot trust the binary you are executing, you already have lost.

๐Ÿ‘

Synchronous execution

Anti-pattern. Not important. We can defer to execa.

๐Ÿ‘

reject option

No. I feel this was a mistake in execa. It complicated the code and types for little gain.

๐Ÿ‘

execaNode()

Maybe? I think don't it's super important, but could be a nice to have. Let's see how many LOCs the other things take first.

๐Ÿ‘

cleanup option

No

๐Ÿ‘

The $ script interface

Same as below. This package is for reusable packages, and we should make that clear in the readme.

The template string format

Nah. This package is mostly for reusable packages where using the array syntax doesn't matter. For scripts, execa is the best choice.

Yes, that's a good take: only for typical libraries usage, as opposed to scripts, where Execa would be better instead.

Stripping the final newline

Already done because the output are just lines.

lines option

It's the default without opt-out. Not sure we need an option?

Line iteration

Already the default.

Nice idea. Just making everything line-wise.

Options binding

Not sure. This is mostly useful for scripts, which is not the target. Thoughts?

Yes, that's something that can easily be done by users themselves.


subprocess.kill(error)

Yes

That should actually be fairly small to implement. On the other hand, is it useful enough? The main difference with subprocess.kill() is it sets the error that's being thrown. This is something that can be emulated by users with a try/catch block.

error.message additions

This could be useful.

Maybe we can do a shortcut implementation. Execa's logic there is actually fairly big.

No full Windows support (too much code and workarounds, should be fixed in Node.js itself instead)

Full Windows support

Would require too much code. Let's see how far we get with the other more important things.

Yeah, that's a tough one, due to the amount of code that would be required for Windows. And that's a good point that Node.js should just fix some of those issues (even though they have notably not do so for many many years now).

We should have a clear understanding (and automated tests) of what is not working on Windows, so we're aware of the limitations.

input and inputFile options

Yes, but I would like to simplify it. Would be great to end up with a single option. Maybe one that accepts an iterable or async iterable.

Having a single option makes sense.

I wonder whether iterables would be needed for inputs (as opposed to outputs). For a subprocess, those would most likely be producing string chunks. For sync iterables, users can easily turn it into a single string using [...iterables].join(''). For async iterables, Array.fromAsync() can be used.

They would be useful mostly for incremental/progressive input (programmatic, not coming from an interactive terminal). Technically, users could still do something similar by calling subprocess.stdin.write(), even though we'd want users to avoid streams. But overall, I wonder: how often do users need incremental programmatic inputs for a subprocess?

It seems to me the main types of input that users probably need are: none, string, file, inherit.

Verbose mode

Yes. Could be useful for debugging. Maybe a simpler implementation than in execa.

result.duration

This could be useful in verbose option, so we add verbose mode, we could just add this too.

The main parts are printing: the command line, the output, any error, the success/failure status, the duration.

This is very useful indeed. I am wondering how the code for it can be reduced. For example, printing the output might need to be skipped, since it's difficult to do that without changing the code's behavior. ๐Ÿค”

Also, users can implement most of it themselves by doing some console.log() before/after the subprocess call. I wonder whether this might be a nice-to-have feature, best to leave for Execa instead?

Transforms

Yes. Should not be hard since we only deal with lines of text.

Wouldn't having access to the output as an iterable be enough? Users can then "map"/"reduce" over it. It's also fairly intuitive.

const output = [];

for await (const line of ...) {
  output.push(mapLine(line));
}

Interleaved output

It's the default.

In most cases, stdout carries the useful result, and stderr is mostly empty except for error messages or things like progress lines. Would having the two mixed by default cause some issues? For example, if the subprocess outputs JSON on stdout (but not stderr) that the user intends to parse.

Based on this, it seems like it might important to still keep those separate?

That being said, it is true that merging both would simplify the API, because we could define only [Symbol.iterator] instead of two explicit methods. Also, maybe the above comment is more relevant for the full result (result.stdout/result.stderr) that for the incremental/progressive result.

I.e. maybe stdout/stderr can be separate in the final result, but not when iterating?

One gotcha is that it requires merging iterables, which might increase implementation size. Note: I could be wrong but I think the current implementation here makes each line of stdout waits for each line of stderr (or the other way around) due to using Promise.all()?

All the stdin/stdout/stderr additional features (multiple targets, files, web streams)

It needs some kind of piping, but I would want it to be as simple as possible. Maybe based on async iterables only?

Piping

A simplified interface for this could be useful.

The simplest way to implement piping would be to just do subprocess.stdout.pipe(subprocess.stdin). Using iterables would probably add more code IMHO. There are a few issues that Execa solves (for example, how EPIPE is handled), but maybe there's a way to shortcut those?


What do you think?

Note: after your response, I'll break it down into separate issues in the new repository, to make the conversation clearer.

Yes, being a Promise is probably nicer. That's what most users seem to want. The main problem Execa has is merging the subprocess with that promise. However, we could expose it as a property instead. (see #413)

๐Ÿ‘

That should actually be fairly small to implement. On the other hand, is it useful enough? The main difference with subprocess.kill() is it sets the error that's being thrown. This is something that can be emulated by users with a try/catch block.

Yeah, we can leave it out for now.

Maybe we can do a shortcut implementation. Execa's logic there is actually fairly big.

I didn't realize it was a big implementation. Maybe not worth it then. It's not essential, just nice to have.

Node.js should just fix some of those issues (even though they have notably not do so for many many years now).

We should really open new issues to restart old conversations. It would especially be useful if we can say that Bun or Deno already does it. They seem more willing to fix things that are missing in Node.js that exists in Bun or Deno.

I wonder whether iterables would be needed for inputs (as opposed to outputs).

It's mostly about memory, but yes, I don't think there is a common need.

It seems to me the main types of input that users probably need are: none, string, file, inherit.

๐Ÿ‘

Also, users can implement most of it themselves by doing some console.log() before/after the subprocess call. I wonder whether this might be a nice-to-have feature, best to leave for Execa instead?

We can defer it and see the size of the package when we have added the essential stuff.

Wouldn't having access to the output as an iterable be enough? Users can then "map"/"reduce" over it. It's also fairly intuitive.

Indeed. Didn't think about that.

Based on this, it seems like it might important to still keep those separate?

It also exposes .stdout and .stderr iterables. It's just that the main promise is iterable on both.

I could be wrong but I think the current implementation here makes each line of stdout waits for each line of stderr (or the other way around) due to using Promise.all()?

Oh yeah, that is problably wrong.

The simplest way to implement piping would be to just do subprocess.stdout.pipe(subprocess.stdin). Using iterables would probably add more code IMHO. There are a few issues that Execa solves (for example, how EPIPE is handled), but maybe there's a way to shortcut those?

Yeah, we can always fall back to that. But would be nice to just have a simple way to pipe. Maybe we could take some inspiration from https://bun.sh/docs/api/spawn

๐Ÿ‘ to all of the above! ๐Ÿ‘

I have added issues in nano-spawn, so we can close this.