andywer/threads.js

workers getting slow on every call

TrashUwU opened this issue · 5 comments

I use 3 workers to loop over large arrays concurrently and return the best Levenshtein distance to a query. They are fast at the beginning, but they get slower with every function call.

// index.js
...
const app = express();
const { spawn, Thread, Worker } = require('threads');
const workers = [await spawn(new Worker("./work.js")), await spawn(new Worker("./work.js")), await spawn(new Worker("./work.js"))];

for (var i = 0; i < 3; i++) {
  const bigarray = [];
  for (var j = 0; j < 135000; j++) {
    bigarray.push(Math.random().toString());
  }
  workers[i].init(bigarray);
}

app.post('/', (req, res) => {
  const query = req.body.query || Math.random().toString();
  const promises = [];
  for (const x of workers) {
    promises.push(x.result(query));
  }
  Promise.all(promises).then(p => res.send(JSON.stringify(p)));
});

// work.js

const { expose } = require("threads/worker");
const db = {};
const leven = require('leven');

expose({
  init(array) {
    db.data = array;
    return;
  },
  result(query) {
    const results = db.data.map(x => leven(x, query));
    return { bestMatch: Math.min.apply(null, results) };
  }
})

The API above gets 2 calls per 1000ms. The console also warns that the max event listeners limit has been reached.

Thanks for sharing, @TrashUwU. Don't see any mistakes on your end so far.
The max event listeners warning is annoying and should be fixed, but it should not be critical. I will think about it some more.

Maybe someone else has an idea or has witnessed something similar before?

Btw, can we rule out that the workers just get slower once the work starts piling up? Also… how much performance degradation are we talking about here?

Workers usually take 500ms-3000ms at first, and then the delays start to grow (5000ms, 6000ms....27000ms....190000ms). Yes, it happens when the work starts piling up.
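
To illustrate why the delays keep growing once work piles up, here is a minimal sketch of the arithmetic. It assumes each worker handles one call at a time in arrival order (an assumption about the queueing, not something confirmed in this thread) and reads "2 calls in 1000ms" as one request every 500ms. `simulateLatencies` and its parameters are hypothetical names used only for this example.

```javascript
// Sketch: per-request latency for ONE worker that processes calls one at a time (assumed FIFO).
// arrivalIntervalMs and serviceMs are illustrative inputs, not measurements from the issue.
function simulateLatencies(arrivalIntervalMs, serviceMs, count) {
  const latencies = [];
  let workerFreeAt = 0; // time at which the worker finishes its current call

  for (let k = 0; k < count; k++) {
    const arrivedAt = k * arrivalIntervalMs;
    // A call cannot start before it arrives or before the previous call finishes.
    const startedAt = Math.max(arrivedAt, workerFreeAt);
    workerFreeAt = startedAt + serviceMs;
    latencies.push(workerFreeAt - arrivedAt);
  }
  return latencies;
}

// Requests arrive faster than they are served: latency grows with every call.
console.log(simulateLatencies(500, 3000, 4)); // [ 3000, 5500, 8000, 10500 ]

// Requests arrive slower than they are served: latency stays flat.
console.log(simulateLatencies(500, 400, 4)); // [ 400, 400, 400, 400 ]
```

Under these assumptions every extra request adds (serviceMs - arrivalIntervalMs) to the wait, so the growth is a backlog effect and does not require the workers themselves to be getting slower.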

Is there anything I can do about it? They take anywhere from 500ms up to 200000ms (when work starts piling up) to return a result, which is very inconsistent. A single thread is a lot faster even when work starts piling up, and it is consistent (it never takes more than 10000ms).

Btw I can't use a worker pool (#312) because each worker has a unique array to search.
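
Since a shared pool does not fit here, one possible way to keep the backlog bounded is to cap the number of pending calls per worker and reject the rest right away. The sketch below is only an illustration of that idea: `createGuard` and `maxPending` are hypothetical names, not part of threads.js, and it only relies on worker calls returning promises.

```javascript
// Sketch: cap the number of unfinished calls for one worker.
// createGuard and maxPending are hypothetical helpers, not a threads.js API.
function createGuard(maxPending) {
  let pending = 0; // calls that were accepted but have not settled yet

  return {
    get pending() {
      return pending;
    },
    run(task) {
      if (pending >= maxPending) {
        // Refuse new work instead of letting it queue up behind older calls.
        return Promise.reject(new Error('worker busy'));
      }
      pending++;
      // Promise.resolve().then(task) also turns a synchronous throw into a rejection.
      return Promise.resolve()
        .then(task)
        .finally(() => {
          pending--;
        });
    }
  };
}
```

In the POST handler, each call could then be wrapped as `guard.run(() => x.result(query))` with one guard per worker, and a rejection could be answered with a 503 so the client can retry later. Whether rejecting requests is acceptable depends on the application.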

Hello???

@TrashUwU
There is an overhead in creating threads, in which case you should consider using a thread pool. 😄