Feature: spawn rather than fork
At the moment workerpool always uses `child_process.fork` (when configured for child processes) - code. That's perfect for child Node processes, but is there any possibility of using something other than Node - e.g. a bash script, or anything else that supports IPC? Such processes would be started with `child_process.spawn()`. I could do a PR with a flag for that option, but I suspect there might be a reason why it isn't in already?
This library started out as just an abstraction layer that lets you create workers without having to bother too much about the underlying mechanics. So far there hasn't been any request for `spawn`, for example. It could be very interesting though!
How would the interaction work with, say, a bash script? Can you give an example of how you see this working in practice?
Hi - thanks for the reply! I actually found a way to work around the original issue I was having - which was using child processes in a Nexe-generated executable. For a while it looked like I'd need to `spawn` a second Nexe-generated executable, but in the end I found I could use `fork`. So my original requirement no longer stands, and this can be safely closed. In case anyone else finds it useful, though: I'd wondered about making another option available when calling `setupProcessWorker` that would let it call `child_process.spawn()` rather than `child_process.fork()`.
Thanks for your feedback, good to hear you've found a solution for your case!
Yes, we could consider this option, but I prefer to wait for an actual use case first. Let's close this issue for now and reopen it as soon as the need arises.
Update to: #389
Hey, after spending some time this week implementing my use case and researching both `exec` and `spawn`, I realized this might not be possible.
I assumed `exec` and `spawn` behaved the same way as `fork`, where unless you close them manually they keep running and can therefore be re-used. However, I believe this is not the case. Someone more experienced could perhaps check the Node docs, but from what I can tell, Node just runs the command, hands execution and resource handling over to the process it just called (staying completely hands-off), and then listens to the output. After the command has run and Node receives the response, the child closes automatically (and cannot be re-used?). This seems entirely true for `exec` (from what I can tell), but `spawn` accepts a `stdin` stream, so perhaps you can pass multiple inputs to it?
Like I said, I haven't looked that deeply into it, as I found `exec` sufficient in my case - I just call it inside the thread created by workerpool. However, if `spawn`ed processes can be pooled, that would save a ton of resources.
You are right. Both `exec` and `spawn` execute one task and then exit. This is quite far from workerpool, which is about reusing workers to execute tasks. I can imagine that some manager on top of `exec` and `spawn` would be useful to queue tasks and limit the number of processes, but it would be quite different from workerpool, and it is probably best to create a separate library for just that purpose.
See also: https://stackoverflow.com/questions/48698234/node-js-spawn-vs-execute