why is it so hard to add threads to nodejs? or web workers?
p3x-robot opened this issue · 93 comments
I tried a C++ async addon, and it took me one day (plus a function to cancel a thread).
Why is it so hard to add a thread to Node.js?
Ref: nodejs/node#13143 (comment)
I believe it's not hard to implement, what might be hard is to close all the edge cases. It's just that it's a big code base that was written and tested as a single (execution) thread program.
IMHO that is why we are trying to understand what the use cases are, and whether the cost will be worth the benefit. Just as an example, if a multi-process architecture can improve 80% of the use cases for 20% of the cost, then that's a good investment of everybody's time 👍
But V8 and Chakra already have web workers. Isn't that all we need?
We have a non-blocking, single-threaded event loop, and we just add web workers, which already exist on top of threads. We don't need anything else, like atomics, since it would work like web workers: closed off, with data passed via events, the default way web workers work.
Maybe later shared arrays, WebAssembly, etc.
It doesn't look like much.
Adding a C++ addon with async functions is simple, but for 75% of use cases we don't even need C++; JS is enough.
Only for AI, like facial recognition or speech, might we need the horsepower.
What more do we need to implement when it is already in Chrome and Chakra?
we dont need anything else, like atomics, since it would be like web workers
Well, having shared memory available would be a huge plus for Workers. You can already get acceptable event-based process coordination using process.on('message') and process.send(), so there are use cases already covered.
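As a rough illustration of the process-based coordination mentioned here, the sketch below starts a child Node.js process with an IPC channel and exchanges one message in each direction. The child source, the doubling task and the names `childSource` and `reply` are made up for the example; only `process.on('message')`, `process.send()` and the `'ipc'` stdio option come from Node.js itself.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical child program: doubles the number it receives and replies.
const childSource = `
  process.on('message', (n) => {
    // Disconnect only after the reply has been handed to the channel,
    // so the child can exit cleanly.
    process.send(n * 2, () => process.disconnect());
  });
`;

// The fourth stdio entry, 'ipc', opens the message channel between the two.
const child = spawn(process.execPath, ['-e', childSource], {
  stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
});

// Resolves with the first message the child sends back.
const reply = new Promise((resolve, reject) => {
  child.once('message', resolve);
  child.once('error', reject);
});

child.send(21);
reply.then((value) => console.log('child replied:', value));
```

Every message is serialized and copied between the two processes, which is why shared memory is described above as a plus rather than something this approach already provides.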
But V8 and Chakra already have web workers. Isn't that all we need?
If you want simple WebWorkers, yes, that would be comparatively easy. If you want to provide the full Node.js API (including things such as require()), it gets tricky, because a lot of the current code assumes isolation by process, so we’d likely prefer a multi-process model. Implementing shared memory support for that would be tricky.
Also, just to give everybody a status update, I’ll try to draft an EP text this week.
I guess require should be isolated, keeping it simple like Node.js. Programmers can choose how to work with shared memory. Today we use multiple servers and multiple cores, and that can be left to the developers. Myself, I use Redis as shared memory, but you can use a cluster, or one shared process with event-based communication.
So the tricky part, shared memory, can be done via npm by anyone else instead of in Node.js. :)
Don't you think? Once we have web workers, shared memory can be implemented by many solutions, like express, connect, koa, etc.
Also, don't you think we should not try to implement everything right away? It will take a few versions and refactors to mature.
With an isolated require plus native web workers, we would have something awesome for things like calculations.
Why do you think I/O should not be included?
If you want simple WebWorkers, yes, that would be comparatively easy. If you want to provide the full Node.js API (including things such as require()), it gets tricky, because a lot of the current code assumes isolation by process, so we’d likely prefer a multi-process model. Implementing shared memory support for that would be tricky.
So we need require in isolation and we are done? A require('something') in a thread will not be the same as require('something') in the main process; they will have different memory and variables. So we are done?
Later we can add shared data and atomics as well, but not everything right away :)
So we need require in isolation and we are done?
That depends on what you mean by “in isolation”. What can you require in Workers? Built-in modules? Not everything would work outright, like fs, but right now require depends on fs working fine. Other things (e.g. everything in the v8 or vm modules) should probably be available in some way anyway, using require or not.
Why do you think I/O should not be included?
I’m not saying it should not be included, I’m saying we might need a few tricks to get it to work. Or generally: The more of Node’s internals we want to expose, the more carefully one needs to think about what could go wrong.
Also, don't you think we should not try to implement everything right away? It will take a few versions and refactors to mature.
Because I’d like to make sure that we come up with an API that addresses everybody’s use cases, and doesn’t just take the path of least resistance. In particular, I would really not want to get us locked into a specific implementation and look back in a few months and have to say “yeah, this was a bad idea” (for example, it’s tempting to go for multi-thread support rather than multi-process support, but there are very valid points for not doing that).
Isn't it just a matter of versions:
- default V8/Chakra web worker standard
- importScripts => require
- pass shared buffers (like the web worker API)
What other use cases are there?
Why is require with fs tricky when the require cache is isolated, so that the main process and every thread each have their own require?
But V8 and Chakra already have web workers. Isn't that all we need?
Just to clarify my comment. The big codebase I was referring to is node's. Even though V8 and chakra support MT, we have never tested (as MT) the codebase that wraps those engines and turns them into node.
That's true; if we leave out require, it works out of the box.
But if we definitely need require, then we have to care about what we load with it. I guess every thread should have its own require cache.
It looks easy, but it is easier said than done.
But guys, I can see there is code going in. So you are already working on it?
Besides, process.nextTick can work like a thread, can't it? Threads are overrated.
IntelliJ IDEA is one process with something like 1000 threads, and when it indexes some 30 projects at once the GUI freezes. If indexing ran in a different process, it would never block the GUI, but they don't like multiple processes, so IDEA is always blocked, even though I use it for everything (CLion, IntelliJ Ultimate).
I guess web workers are good, but overrated if we use process.nextTick.
The main problems are generally the idea that Worker should contain require and process; those are leaky abstractions tied to process-level shared data (process.env/cwd and require.cache). If instead Worker relies on passing transferables and async communication, it is pretty simple.
I don't think the hacky solutions to these problems are good and really hope to go the route of not trying to create light weight processes
Wouldn't the Worker's require and process be its own, not shared at all (but in a thread)?
@p3x-robot you can't decouple them, they use OS level shared data
I don't understand the question.
@p3x-robot to a limited extent you can share data via transferring between threads, however some data like cwd and env come from the OS and are not thread local. For v8, you cannot share JS objects between Isolates/threads so you have to implement a layer like transferables. Even with that however syscalls like cwd are generally not thread safe to mutate which people do w/ standard Node APIs.
I am stating that certain Node APIs are not safe to reify in workers and workers must pass messages to the main thread in order to coordinate with them. This is similar to how the browser does not expose all the DOM APIs to workers due to threading issues.
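To make the idea of transferables concrete, here is a small sketch using the `MessageChannel` API that later shipped in Node.js's `worker_threads` module. Both ports stay in one thread to keep it short; in a real program one port would be handed to a Worker. The variable names and the 16-byte buffer are only illustrative.

```javascript
const { MessageChannel } = require('node:worker_threads');

// Two connected ports; normally port2 would live in another thread.
const { port1, port2 } = new MessageChannel();

const buffer = new ArrayBuffer(16);
new Uint8Array(buffer)[0] = 42;

// The second argument is the transfer list: ownership of `buffer` moves to
// the receiving side instead of the bytes being copied.
port1.postMessage({ buffer }, [buffer]);

// After the transfer the sender's buffer is detached and reports length 0.
const senderByteLength = buffer.byteLength;

// Resolves with the first byte as seen by the receiving side.
const received = new Promise((resolve) => {
  port2.once('message', (message) => {
    resolve(new Uint8Array(message.buffer)[0]);
    port1.close(); // closing one side closes the pair
  });
});

received.then((firstByte) => {
  console.log('sender length after transfer:', senderByteLength);
  console.log('first byte on the receiving side:', firstByte);
});
```

Plain JS objects are still copied when posted; only objects that are explicitly transferable, such as `ArrayBuffer`, can be moved this way.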
Subscribing to this issue. Having this in Node.js would enable me to develop apps that leverage WebWorkers to offload CPU intensive operations (i.e Crypto) and make them work in the browser and Node.js in the same way. I would use it for js-ipfs and js-libp2p.
For now, there are so many issues and it would be so slow that I am just using processes; even Chrome uses a process for everything. You can easily use https://www.npmjs.com/package/tiny-worker, check it out!
Although you can do a lot with C++ as well. There are lots of examples for threads from Node.js/V8...
In order to move this forward, I think it might be good for people familiar with the internals to specify:
- Which API's definitely cannot be shared between threads (+ some explanation of why)
- Which API's would need some work (+ some explanation of what would be required)
- Which API's can easily be shared between threads
This way we can build up a list, discuss and converge on an outline of the end goal.
Food for thought: https://webkit.org/blog/7846/concurrent-javascript-it-can-work/
That comment is not in line with our code of conduct. I'd like to kindly ask that you consider removing it.
Can we please keep comments in this thread relevant?
@p3x-robot @AngelAngelov personal attacks are considered a code of conduct offense, therefore i'd like to ask you two to remove or otherwise edit your comments
What happened? Comments appear to be removed.
@AngelAngelov I think https://github.com/Microsoft/napajs/blob/master/docs/api/node-api.md will be complete at some point, and you can use threads at will without web workers as well.
Merry XMAS
I wish you guys can take a look at here. https://github.com/alibaba/AliOS-nodejs/wiki/Workers-in-Node.js-based-on-multithreaded-V8
We propose a system for asynchronous multithreading in Node.js with an API conforming to the Web Worker standard, and maybe the performance improvements show that it's worth the effort.
@aaronleeatali do you know how I can build it? It is dated December 2017. I am trying to build it as of that date, while we try to use the current version of Node.js (9.9.0 at the moment)...
I am building it. On ARM I could build the original Node.js with 9 threads; so far this one couldn't build with 9 threads, so now I am building with bare make.
If this works, then it can replace Node.js, like a new IONodeJs. The name AliOS, of course only in my opinion, is not so beautiful, but that's style, not a fact.
I cannot build it. I guess I would have to install an older GCC, but I don't want an older one, so I can't use it.
It would be cool if it had the same build as Node.js, as I can build Node.js right away.
@aaronleeatali Node.js builds right away, and this is not really maintained. It would be cool to make it work, because the idea is good; I just can't switch to an older GCC etc...
The original current version builds right away with make -j9, though multiple threads are not so important:
tubdata.a /home/patrikx3/Projects/node-v9.9.0/out/Release/obj.target/deps/v8/src/libv8_snapshot.a -ldl -lrt -lm -Wl,--end-group
rm df0b74446112ab5c8a7e9e1251814c22af28a6df.intermediate
if [ ! -r node -o ! -L node ]; then ln -fs out/Release/node node; fi
patrikx3@workstation:~/Projects/node-v9.9.0$ ./node -v
v9.9.0
@p3x-robot sorry, we have a plan to open the source, but it will take a few days. The implementation is not on the github currently.
@addaleax We want to let the CTC know that we may have some radical opinions to improve node.js performance and we have implemented some of them. Maybe the implementation is not elegant, we still want to give back to the community. We would be glad if it can bring some new ideas to the discussions and it's the pleasure if we could help in the later works of node.js.
@aaronleeatali on the topic of the changes alibaba made for startup with snapshots... nodejs/node#17058
@aaronleeatali thanks, it is a groundbreaking release once we can build it, it is fantastic, thanks.
@devsnek liqyan is my teammate and he sits beside me.
@aaronleeatali, @patrou is doing measurements to find/document use cases where different approaches (napa.js, workers, etc) provide a benefit. Once you have your implementation open sourced we should include that in the different options as well.
Hi @patrou,
I maintain a List of Parallel and Shared-Memory Javascript Designs and Implementations
and made a point to indicate the documented original intent for each implementation. Regrettably, my conclusion was Fibonacci sequence is the most important problem facing parallel JS.
Nothing would make me happier than to be shown I'm wrong, I am at your disposal.
I think use cases are parsing, hashing and encrypting large data objects which holds up the Node.js event queue. These tasks cannot be easily handed off to a separate process, because that would require parsing, too
Here you would want zero-copy threads for data transformations
However, once you have that, other downstream troubles like thread pooling and heap sizes are likely to occur
@p3x-robot @mhdawson @mogill @patrou, Sorry I didn't reply, I was very busy last month.
I will submit all the code this week.
@aaronleeatali well, if this works, many people will be happy, many people tried it, none are totally usable, i never use threads, but tons of people would use it, you would be like a god!
@p3x-robot Sorry, I am still waiting for the approval and It will take a few more days.
@refack , re "will the cost be worth the benefit"; It's definitely worth it considering that's the dealbreaker for folks who opt for Java. See nodejs/help#560 (comment).
@Pacerier java is a middle man, object oriented, VM...
there will always be JS forever!!!!
JS is a little bit slower (with cluster it can often even be faster than Java!), but it is much easier to code with (I dislike TypeScript for its slowness)
and then for speed there is C++
i praise this method!
you can't use 1 thing at once. try 2. besides with java you still need js on the backend.
of course some people dont like using 2 languages.
use java! 👍
i love c++ and js!!! 4EVER!
@p3x-robot, NodeJS wants to replace Java/C++/PHP/C/ASM/etc (other "server-side" environments). But it simply can't until it can do proper multi-processing.
That's true, but check the benchmarks for Apache vs. nginx.
Apache is thread-based; nginx uses worker processes and is non-blocking.
I think JS is just for functional programming, and I am using farms. Threads are just not usable; I use non-blocking calls for files and databases, across servers or in-process.
i will never understand why threads is required
nodejs is not like c++, java, go, php, ruby.
nodejs is more like f#, ocaml
do you have to show a use case ? why can't you just use nodejs c++ addon???
It is so easy with N-API... I tried it with the bare addon, and it took one whole day to create a promise-based C++ addon. I would never do processing with JS, and you can pass shared memory from C++ to JS.
Isn't C++ the fastest? If you don't need speed, JavaScript on Node.js is your best friend.
its like c++ is a man, nodejs is a woman and you can connect.
for me this question is useless, i was thinking about it for a 1 year to understand and i will never use threads in nodejs even it they are there.
node-gyp is so easy to use and manage memory with c++ and pass to js in if you want just 1 server...
that's my thought.
https://github.com/alibaba/AliOS-nodejs/wiki/Lite-Thread-Worker
You can try thread worker here. It can run on arm-32bit or x64 in linux currently.
Further details will be provided later.
@addaleax
I heard that you have implemented web worker.
I wish you could take a look at our proposal and give some advice.
We aim to contribute to node.js and we still want to help with multithreading in node.js even if our proposal is rejected.
@aaronleeatali can you clarify the situation with shared memory? As it's based off web workers, are you planning to implement something like SharedArrayBuffer?
@aaronleeatali I think the core set of your changes to Node.js is in alibaba/AliOS-nodejs@b1e4b81, right? That actually seems very similar to what I’ve come up with in a few regards.
My own current set of changes is in https://github.com/addaleax/node/tree/worker, which I was planning to open a PR against nodejs/node with this week. Looking at it, it seems like some of the main differences are:
- Your changes don’t expose the whole Node.js API, whereas with mine you can require() all core modules and userland code. It’s not really a WebWorker-style API in the Web sense, but that could be built on top of it.
  - This might also make it not quite as lightweight as what you are suggesting – that would have to be benchmarked, I assume.
- I’ve included Node.js variants of the Web’s MessageChannel/MessagePort APIs, and built my changes on top of that. It’s very close to the Web standard, and it does support transferring ArrayBuffers, other MessagePorts and sharing SharedArrayBuffers.
  - It seems like you did something with very similar internals here, just inside V8. However, I am very unsure about the way you implemented SharedArrayBuffer transferring – there seems to be something missing that manages the lifetime of the underlying memory.
- You also have made significant changes against V8 for sharing data and code between threads. This is awesome, and would be another huge step forward for JS.
  - Have you taken any measures to merge your changes into upstream V8? I am pretty sure that that would be a prerequisite for including any of your changes into Node.js.
Lastly, I don’t think that your proposal would be rejected in any way. It doesn’t seem mutually exclusive with the changes I’d like to make.
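For readers who have not used the MessageChannel/MessagePort style described in the list above, the following is a minimal sketch written against the `worker_threads` API as it later shipped in Node.js, not against either branch discussed here. It transfers one end of a channel to a worker, which then replies over it. The inline worker source and the greeting text are invented for the example.

```javascript
const { Worker, MessageChannel } = require('node:worker_threads');

// Hypothetical worker: waits for a port, then answers over that port.
const worker = new Worker(`
  const { parentPort } = require('node:worker_threads');
  parentPort.once('message', ({ port }) => {
    port.postMessage('hello over a transferred port');
  });
`, { eval: true });

const { port1, port2 } = new MessageChannel();

// MessagePorts are transferable: after this call port2 belongs to the worker.
worker.postMessage({ port: port2 }, [port2]);

// Resolves with the first message that arrives on our end of the channel.
const greeting = new Promise((resolve) => port1.once('message', resolve));

greeting.then((text) => console.log(text));
```

Because ports can be passed on again, two workers can be wired to talk to each other directly without routing every message through the main thread.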
The dual heap (private/shared) design looks like a good choice, will there be a way for programs to explicitly indicate where an object should be stored? Similarly, will there be a way for parallel programs to help with scheduling Stop-The-World GC events to improve scaling?
Discussion
If we can use threads in Node.js at will...
Is it going to cost Node.js some of its current speed?
I am afraid some changes could make Node.js slower because of threads.
It would be good if it had zero cost.
Or the cost could apply only with a flag, like /usr/bin/node --threads, or, if there is a cost, I think it would not be worth it; maybe a different build like /usr/bin/node-threads.
Node.js is fast because there is no additional thread and it is totally non-blocking.
I hate Java and C# by now, or TypeScript and React. I am a big fan of pure JavaScript.
All threads cost a context switch, right?
Can we have a Node.js with threads that, if I do not use them, is the same speed as Node.js with no threads?
Can we have a Node.js with threads that, if I do not use them, is the same speed as Node.js with no threads?
Yes. All necessary changes that could be expected to affect performance in single-threaded applications have already landed in Node.js, and haven’t shown any concerning impact so far. They are also very localized – unless your application is doing nothing else other than editing environment variables, you’ll be fine. These changes will most likely be released in Node 10.2.0.
I don’t know about @aaronleeatali’s work, but I wouldn’t expect it to have any significant impact on existing code either.
So it will only use new threads if I want a new thread, right? It will keep the single process and receive events, right?
Because once you start using too many threads, the process blocks itself, like IDEA: I love it, but it is slow because of threads. They could use a separate indexing process, with the GUI in another process, so the GUI would never be blocked.
Look at Chrome, the most stable ever, because of multiple processes.
So, if I understand right, it will only use new threads at will; Node.js will not start something like 1000 threads because I have 20000 requests, correct?
I love threads, but with tons of care and at zero cost, and only if I want a thread.
It is like me thinking of too many things at once and never finishing because of it, blocking myself.
I am afraid of threads. 👎
Apache is slow, nginx is fast.
Even PHP uses FPM (multiple processes),
but Node.js is faster, for some reason...
C# and Java are the slowest.
Why don't we just settle on C++ for computing? JS is not for computing... it is functional...
C++ is the best and easiest for speed.
JS is the best and easiest for functional programming.
There is no one language... most people use at least two.
Or if someone wants to use only one language and it will be JS, there must be a cost, but not for Node.js, for those who don't want to respect speed vs. functionalism.
so, it will only use new threads, if i want a new thread, right? it will keep the singleton process and receive events, right?
Yes.
Also, regarding your other comments, two things: a) There’s also WebAssembly, which is nice because it means it’s no longer “JS or ” anyway, and b) let’s try to be productive and Node.js-focused here.
Yes, threads have downsides. It’s not a solution for all problems, but for some – Node.js won’t stop putting async-I/O first, but the downsides of blocking computation are real.
@addaleax I am interested in this topic a lot. I use Electron, which has web workers. I know WebAssembly, very good. I created a Node.js async/await gyp add-on, best for computing.
Besides, I am starting to get to know you guys a bit, and I think that with what you are working on in Node.js (I wish I had the money to work only on Node.js core), what you are telling me is that Node.js is going to be better and is not like Java or C#; you are writing the future for me as well.
HUGE PROPS GUYS FOR ALL THIS TO COMMUNICATE ABOUT THE BEST LANGUAGE IN THE DIMENSIONS!
good night
@addaleax a big question, sorry again from me.
Is it possible for me to change cluster.schedulingPolicy so that I could add a new option that picks the worker process with the least load?
That would be important, because it looks like the master does not know anything about the load in the workers, and it could make our systems faster.
@linusnorton @mogill
User can't control which object should be shared. I think the way you use worker is like the way you use actor in actor mode. We share internal things in V8, something like SharedFunctionInfo, String(primitive objects),etc. I think our changes to V8 is very suitable to support multithreading in actor-mode.
When you use worker, the only thing shared on user level is SharedArrayBuffer which is disabled in the newer version. Our work is on V8 6.1 and you can still use it.
I think the core set of your changes to Node.js is in alibaba/AliOS-nodejs@b1e4b81, right? That actually seems very similar to what I’ve come up with in a few regards.
Yes
Your changes don’t expose the whole Node.js API, whereas with mine you can require() all core modules and userland code. It’s not really a WebWorker-style API in the Web sense, but that could be built on top of it.
This might also make it not quite as lightweight as what you are suggesting – that have to be benchmarked, I assume.
Actually, user can use all core api of node.js in our implementation.
Every thread has a node env and event loop. If you don't want to replace V8 with our version, you can still use thread worker and use core api of the node.js in our worker. It will be relatively heavy and will take a long time when you start up the worker with node env(without node snapshot).
We changed V8 to share internal things between threads, and we found that our changes to V8 can save 60%+ of the memory of an empty worker (with node env). The more JS you use, the more memory you can save. We can also improve the startup speed of the worker (with node env) by about 2X (13 ms vs. 50+ ms).
It seems like you did something with very similar internals here, just inside V8. However, I am very unsure about the way you implemented SharedArrayBuffer transferring – there seems to be something missing that manages the lifetime of the underlying memory.
Yes, all the ArrayBuffer will be released before the process is over. We plan to handle it later.
You also have made significant changes against V8 for sharing data and code between threads. This is awesome, and would be another huge step forward for JS.
Have you taken any measures to merge your changes into upstream V8? I am pretty sure that that would be a prerequisite for including any of your changes into Node.js.
It's very difficult to merge our changes into V8 because we haven't synced with the V8 team. We will try to communicate with the V8 team later; I will write a design document in English first.
I haven't read the code of the latest V8, but I found that newer V8 (maybe 6.3) did the same thing as us. For example, moving thread data from "Shared Object" to "Private Object" (SharedFunctionInfo --> Feedback Vector).
@p3x-robot , Re "why can't you just use nodejs c++ addon"; You mean why not use Python/Java/C/ASM? Sure, but it aint JavaScript.
Re "i am a big fan of pure JavaScript"; Exactly.
Re "this question is useless, i was thinking about it for a 1 year to understand"; It's useful as needs for shared memory exist.
Re "some changes could make node.js slower because of threads"; This matters not; pure irrelevancy as speed needers thread.
Re ""chrome, the most stable ever. because of multiple processes""; Chrome has SharedArrayBuffer but not NodeJS, that's the whole point. Can NodeJS do SharedArrayBuffer?
Re "even php is"; PHP ain't used for writing databases and custom network stacks (cf Burpsuite); what's with "even"? Indeed, compare not w PHP but Java.
Re "just settle on c++ for computing"; Because NodeJS.
Re "no one language"; And the whole point of NodeJS is? Compete or extinct. If NodeJS won't, Java will, and that's my point. (Cf #4 (comment).)
@Pacerier thanks so much! now i see the light!!!!
i am creating benchmark on node v10 vs @aaronleeatali -s node 8.9 with light workers,
if this has no cost (i will not use threads), you @aaronleeatali is the god!!! and my precious nodejs
is perfect... 🥇
💯 🔢
👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍
Discussion
@addaleax I analysed threads in Node.js, and I wish they had never thought about threads. It will slow down the whole of V8 because of mutexes, which are much slower. I would rather live with a slower but fast environment. Besides, is regex slower? Is 3D ray tracing faster? There are CUDA cores to do that work for us. The operating system can control the scheduling, but even if you are in a GUI thread, the other thread will block the whole process, and Node.js is based on being 100% non-blocking. Maybe with a flag, a different build and a V8 hack. I hope the creators of V8 and TurboFan will not choose threads for the event loop. It will block.
To be able to use a thread, it will slow everything down. Memory is not always the best; logic can be better, though that is a question. But with this solution the event loop will be based on a thread instead of one event loop.
Of course, people want threads when we are scaling cores from 1 to X, plus hyperthreads, which are scheduled by the CPU. In Node.js, can we schedule the thread? If we can schedule threads in Node.js, then I guess it could be good.
Keep Java and C#. Node.js is 100% non-blocking, which is the natural beauty of Node.js.
But after all, this is a discussion. I might be missing that the main event loop will have no mutexes at all, that regex will be just as fast, and that ray tracing will be faster.
@p3x-robot I’m sorry but I’m really not sure what you’re saying here. The implementation at nodejs/node#20876 will give you the ability to schedule threads, and there are mutexes in place for some features but you shouldn’t count those as blocking, they will only be locked for a very short time.
@addaleax thanks for your info! you are right, a small mutex will not slow us down at all.
Closing this, given that threads are implemented in 10.5.
@addaleax do the threads work with Atomics? I am testing, just asking if you have any info on it... THANKS
@p3x-robot Yes, it does. :)
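As a small illustration of Atomics working across worker threads, the sketch below has a worker block in `Atomics.wait` on a shared `Int32Array` until the main thread stores a value and calls `Atomics.notify`. The value 7, the 50 ms delay and the variable names are arbitrary choices for the example.

```javascript
const { Worker } = require('node:worker_threads');

// One shared 32-bit slot, visible to both threads.
const flag = new Int32Array(new SharedArrayBuffer(4));

const worker = new Worker(`
  const { workerData, parentPort } = require('node:worker_threads');
  const flag = new Int32Array(workerData);
  // Blocks this worker (not the main thread) while flag[0] is still 0.
  // If the value has already changed, wait() returns immediately.
  Atomics.wait(flag, 0, 0);
  parentPort.postMessage(Atomics.load(flag, 0));
`, { eval: true, workerData: flag.buffer });

// Resolves with the value the worker observed after waking up.
const seen = new Promise((resolve, reject) => {
  worker.once('message', resolve);
  worker.once('error', reject);
});

// A little later, publish a value and wake any waiter on index 0.
setTimeout(() => {
  Atomics.store(flag, 0, 7);
  Atomics.notify(flag, 0);
}, 50);

seen.then((value) => console.log('worker saw', value));
```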
Hey guys!
How are you?
Have you started using threads in Node.js builds that have worker threads enabled?
What kind of use cases are you using them for?
Thank you very much.
It would be great to have some rough numbers when it's worth it to use workers at all. e.g.: Is it possible to speed up functions that take only 10ms? Or more general information about the overhead that will occur when using workers.
I think the docs are really vague here.
Workers are useful for performing CPU-intensive JavaScript operations; do not use them for I/O, since Node.js’s built-in mechanisms for performing operations asynchronously already treat it more efficiently than Worker threads can.
Is it possible to speed up functions that take only 10ms?
That depends – do you spin up a Worker for every invocation? No, that’s not going to help here, it would normally take longer than 10 ms to do so. But if you use shared memory, or at least MessagePorts, to communicate the tasks to a Worker and back? Then, yes, that might be very much possible.
Maybe it’s worth doing more advertisement for using a worker thread pool in the docs?
Or more general information about the overhead that will occur when using workers.
Fwiw, some reasons why no specific overhead measurements are mentioned in the docs are that this is going to depend on the actual machine the code is running on and we’re actively working on improvements re: improving startup time, both for Node.js itself and Workers.
this is going to depend on the actual machine the code is running on
Of course, but we could generate some performance information to give folks an indication of pros and cons. @addaleax Do you know whether this is something the benchmarking WG is considering?
@davisjam Not that I know of. At this point the only benchmark we have is one for passing messages between workers (the MessageChannel approach)…
Maybe it’s worth doing more advertisement for using a worker thread pool in the docs?
Yes good idea. Maybe an example would be nice too.
I feel a little scared to leave the safe harbor of single threads and I fear a big explosion of speed and memory leaks 💀
Thanks for your awesome work btw!
@addaleax 50%+ speedup for my algorithm! workers rock!
Loops needed ~5ms before workers now they need ~2ms on a 6 core machine. I'm using channels for messaging and promises to gather the results. Also I'm creating only (cpu cores - 1) workers for the logic so I have 1 free core to handle new requests etc. Does this make sense? Am I right that every worker gets his own V8 instance that gets optimized only for the functions that the worker handles?
Does this make sense? Am I right that every worker gets his own V8 instance that gets optimized only for the functions that the worker handles?
Yup – as far as V8/all JS stuff is concerned, every Worker is an independent instance. :)
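The "(cpu cores - 1) workers" setup described above could be sketched roughly as follows. This is not the commenter's code: the summing task, the inline worker source and names such as `parallelSum` are assumptions made for the example. The workers are created once and reused, which avoids paying the startup cost per call.

```javascript
const os = require('node:os');
const { Worker } = require('node:worker_threads');

// Leave one core for the main thread, but always start at least one worker.
const poolSize = Math.max(1, os.cpus().length - 1);

// Hypothetical task: each worker sums the chunk of numbers it is sent.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', (numbers) => {
    parentPort.postMessage(numbers.reduce((sum, n) => sum + n, 0));
  });
`;

const workers = Array.from({ length: poolSize },
  () => new Worker(workerSource, { eval: true }));

// Send one chunk to each worker and add up the partial sums.
function parallelSum(numbers) {
  const chunkSize = Math.ceil(numbers.length / poolSize);
  const partials = workers.map((worker, i) => new Promise((resolve, reject) => {
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.postMessage(numbers.slice(i * chunkSize, (i + 1) * chunkSize));
  }));
  return Promise.all(partials).then((sums) => sums.reduce((a, b) => a + b, 0));
}

// Sum 1..1000, then shut the pool down so the process can exit.
const total = parallelSum(Array.from({ length: 1000 }, (_, i) => i + 1))
  .finally(() => Promise.all(workers.map((w) => w.terminate())));

total.then((sum) => console.log('sum:', sum));
```

This version assumes only one `parallelSum` call is in flight at a time. With overlapping calls, replies from different calls could be mixed up on the same worker, so each message would need some way to be matched to its request.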
I had problems with messages coming back from workers when the main function was called in a high frequency and the workers needed different amounts of time to return results. Classic async problem I guess. So old worker returns were smearing results into later called functions. I couldn't find any solution online (the simple subchannel example in node docs produces the same error). Guess this is an edge-case anyway but it produces surprisingly strange results.
My workaround is that I create a couple of subchannels for each worker, send each worker its ports, and tell each worker on every call which port it should use to communicate, always iterating through the array of ports. Receiving specific function results on the parent looks like this now:
worker.subChannels[worker.currentPort].port2.once("message", value => {});
I call it the "Ports-Merry-Go-Round". Not an official term yet? It's official now.
Had no problems after implementing that idea and once V8 is warmed up it's a beast on multi-core.
Was this "workaround" intended by you guys anyway? Is this a stupid idea? Pull no punches please.
@a1xon Just for clarification … is the issue that it’s not obvious how to tie requests to workers back to the responses if they can arrive out-of-order?
A little bit later, I still think JS is a functional language and not about horsepower. For parallelism, the languages are C++ and below, down to assembly or the video card. NodeJS will never be good for processing. It is like Unicorn NodeJs vs Atom Hydrogen Bomb Assembly. Totally different approaches.
@a1xon I’d say it isn’t a problem that’s unique to workers – adding something like a request ID to the passed message could help here, so that you don’t have to maintain multiple ports?
I (and this is really just a personal opinion) think this is a kind of problem that we’d only tackle with a built-in solution if we were to provide some kind of built-in worker pool?
@p3x-robot Keep in mind that JS can definitely perform on the same order of magnitude as native languages, and there are things like WebAssembly that have a significant impact on the performance-in-JS world as well.
Of course, just an opinion. WebAssembly is awesome, just like an add-on or, at last, a thread worker. I still have not found a good use case; we always use native bindings or native C/C++ based processes. But what we have not found yet is coming...
adding something like a request ID to the passed message could help here
Agreed.
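A minimal sketch of the request ID idea, assuming a single long-lived worker and a simple counter for IDs: each request stores its `resolve` function in a map keyed by ID, and one `'message'` listener routes every reply to the right promise even when replies arrive out of order. The delayed echo worker and the names `pending` and `request` are invented for the example.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical worker: doubles a value, replying after the delay the job
// asks for, so replies can arrive in a different order than the requests.
const worker = new Worker(`
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', ({ id, value, delayMs }) => {
    setTimeout(() => parentPort.postMessage({ id, result: value * 2 }), delayMs);
  });
`, { eval: true });

let counter = 0;
const pending = new Map(); // request ID -> resolve function

// One listener for all replies; the ID picks the matching promise.
worker.on('message', ({ id, result }) => {
  const resolve = pending.get(id);
  pending.delete(id);
  resolve(result);
});

function request(value, delayMs) {
  const id = ++counter;
  return new Promise((resolve) => {
    pending.set(id, resolve);
    worker.postMessage({ id, value, delayMs });
  });
}

// The first request is answered last, yet each promise gets its own result.
const results = Promise.all([request(1, 30), request(2, 0)])
  .finally(() => worker.terminate());

results.then((values) => console.log(values));
```

With this pattern a single port per worker is enough, so the rotating set of ports is not needed.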
NodeJS will never be good for processing
I think the question is "Is it fast enough for my purposes, and is it the bottleneck of the system on which I am processing?"
@davisjam in fact, i use requestId a lot for redis or socket.io for passing one time events.
As for the bottleneck, we use things like ImageMagick, or some C or C++, for horsepower. I wanted threads so much, and I still don't need them at all. Of course a thread worker can use a different core, so for whoever finds a good use case it will be totally awesome.
Though threads can block themselves if, for some weird reason, a thread ends up on the same core, thread workers are a 1099999% feature, wanted so much, and now it is in my palm. HAPPY.
@addaleax @davisjam tried to do it with only ids in the first place. But in 1 of 1000 cases the worker were called 2 or 3 times with the same random generated id. So I switched to changing ports and it works like a charm now - I'm also using IDs at the moment. Are you interested in the code? I can try to pretty it up a little.
@p3x-robot I am developing on my own and node.js + V8 is a huge blessing for me. The speed I can achieve now with workers is a lot higher than what I could achieve with naively written C/C++ code. The JIT Optimizers are way better than my C/C++ skills.
It is huge information that you tell us, because i am sure i will face this use case at some point. Thanks so much.
@a1xon Do you think a counter or something like that would work, to avoid collisions? But either way, as long as you found something that works… :)
You can feel free to share code if you think it contains feedback that we can apply to the Workers implementation, or that could be looked at for developing benchmarks/tests/etc. if you think that makes sense.
worker were called 2 or 3 times with the same random generated id
@addaleax sorry, that was misleading.
The workers returned their results a couple of times; they were only called once. It was difficult for me to keep track of the current function results while not binning the old ones. Will open a small repo and push the code later. Thanks for your help :)
Just uploaded the code. Hope I didn't strip too much from it.
https://github.com/a1xon/Node-Workers-Example
Does this look reasonable?
@a1xon I’d be a bit wary of using Math.random() to generate IDs… it might do fine, but it’s probably safer (and a bit easier?) to use a counter, like this:
let counter = 0; // Use 0n if you want to be absolutely sure and use BigInt instead
const main = async someArgument => {
…
let requestID = ++counter;
…
};
That way you can avoid collisions and don’t need to plan for them, even if they come with a very low frequency.
If you need a shared counter for p2p workers without going through a coordination thread you can also create a SharedArrayBuffer and pass it around, and anyone wishing to use it can lock the structure then increment the counter.
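A rough sketch of such a shared counter follows. Note that it swaps the explicit lock for `Atomics.add`, which performs the read-and-increment as one atomic step, so no separate locking is needed for a single integer. The two workers, the 1000 IDs each and the names below are assumptions for the example.

```javascript
const { Worker } = require('node:worker_threads');

// One shared 32-bit counter, handed to every worker via workerData.
const shared = new SharedArrayBuffer(4);
const counter = new Int32Array(shared);

const workerSource = `
  const { workerData, parentPort } = require('node:worker_threads');
  const counter = new Int32Array(workerData.shared);
  const ids = [];
  for (let i = 0; i < workerData.count; i++) {
    // Atomics.add returns the previous value; +1 gives this caller's new ID.
    ids.push(Atomics.add(counter, 0, 1) + 1);
  }
  parentPort.postMessage(ids);
`;

// Start a worker that draws `count` IDs and reports them back.
function runWorker(count) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { shared, count },
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Two workers draw IDs concurrently without a coordinating thread.
const allIds = Promise.all([runWorker(1000), runWorker(1000)])
  .then((lists) => lists.flat());

allIds.then((ids) => {
  console.log('unique IDs:', new Set(ids).size, 'of', ids.length);
});
```

An explicit lock built on `Atomics.wait` and `Atomics.notify` would only become necessary if several shared values had to change together.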