Feature: spawn rather than fork
At the moment workerpool always uses `child_process.fork` (when configured for child processes) - code. That's perfect for child Node processes, but is there any possibility of using something other than Node - e.g. a bash script, or anything else that supports IPC? Such processes would be started with `child_process.spawn()`. I could do a PR with a flag for that option, but I suspect there might be a reason why it isn't in already?
This library started out as just an abstraction layer that lets you create workers without having to bother too much about the underlying mechanics. So far there hasn't been any request for `spawn`, for example. It could be very interesting though!
How would the interaction work with, say, a bash script? Can you give an example of how you see this working in practice?
Hi - thanks for the reply! I actually found a way to work around the original issue I was having - which was using child processes in a Nexe-generated executable. For a while it looked like I'd need to `spawn` a second Nexe-generated executable, but in the end I found I could use `fork`. So my original requirement no longer stands, and this can be safely closed. In case anyone else finds it useful, though: I'd wondered about making another option available when calling `setupProcessWorker` that would let it call `child_process.spawn()` rather than `child_process.fork()`.
Thanks for your feedback, good to hear you've found a solution for your case!
Yes, we could consider this option, but I prefer to wait for an actual use case first. Let's close this issue for now and reopen it as soon as the need arises.
Update to: #389
Hey, after spending some time this week implementing my use case and researching both `exec` and `spawn`, I realized this might not be possible.
I assumed `exec` and `spawn` behaved the same way as `fork`, where unless you close them manually they keep running and can therefore be re-used. However, I believe this is not the case. Someone more experienced could perhaps check the Node docs, but from what I can tell, Node just runs the command, hands execution and resource handling over to the process it just called (staying completely hands-off), and then listens to the output. After the command has run and Node receives the response, the child closes automatically (and cannot be re-used?). This seems entirely true for `exec` (from what I can tell), but `spawn` accepts a `stdin` stream, so perhaps you can pass multiple inputs to it?
Like I said, I haven't looked that deeply into it, as I found `exec` sufficient in my case - I just call it inside the thread created by workerpool. However, if `spawn`ed processes can be pooled, that would save a ton of resources.
You are right. Both `exec` and `spawn` execute one task and then exit. This is quite far from workerpool, which is about reusing workers to execute tasks. I can imagine that some manager on top of `exec` and `spawn` would be useful to queue tasks and limit the number of processes, but it would be quite different from workerpool, and it is probably best to create a separate library for just that purpose.
See also: https://stackoverflow.com/questions/48698234/node-js-spawn-vs-execute