nicolas-van/modern-async

Question about Synchronous Long Executions and Code breaking

Closed this issue · 6 comments

Dear Nicolas and contributors, I have a question about blocking sync operations.

I have a long array, let's say 10000 elements in length, and I have to execute simple functions on it, such as multiplying or squaring values and manipulating the objects inside the array.

On a web server the main goal is to avoid blocking code at all costs. So the alternatives available are:

  1. Use a for loop. THIS WILL BLOCK EXECUTION UNTIL DONE! NOT A SOLUTION.

  2. Use .map / .forEach on the array to execute the sync function on each element.

As I understand it, this does not block the event loop; however, I am not sure of the impact on execution, as thousands of sync functions would pile up.

  3. Use Modern Async with a limit. This way the sync functions would be turned into async ones, and setting the limit to, for example, 1 promise at a time would make the code act as one big promise. After each execution of the sync function the event loop could run, so other functions or requests could be executed and queued without blocking any part of the code, running as many functions as possible when possible.

This method might, in my opinion, be slightly less performant because of all the overhead of promises and callbacks, but it would break a long iteration into small pieces with small intervals between them, so that it does not block execution.

Please let me know what you would do, or whether I am wrong about sync .map / .forEach.

Thanks
Pedro

Hello,

First of all, it should be noted that using async functions does not prevent you from blocking the event queue. async/await does not inject events into the event queue; it injects them into the microtask queue (see here: https://developer.mozilla.org/en-US/docs/Web/API/HTML_DOM_API/Microtask_guide ).

So, as an example, this infinite loop triggering async function calls will effectively block the event loop and completely brick your process (you can try it in the dev console of your browser if you want; it will brick your current tab):

async function doSomething() {}

;(async () => {
  while (true) {
    await doSomething()
  }
})()
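
A bounded variant of the loop above makes the effect observable without freezing anything. This is only a sketch: `demoStarvation` and the iteration count are made up for illustration. A timer scheduled before the loop cannot fire until the whole chain of awaits has finished, because each `await` resumes in a microtask:

```javascript
async function doSomething() {}

// Hypothetical helper, for illustration only.
async function demoStarvation(iterations) {
  const order = []
  // Timer callback: a regular task, it can only run once the microtask queue is empty.
  setTimeout(() => order.push('timer'), 0)
  for (let i = 0; i < iterations; i++) {
    await doSomething() // the continuation is queued as a microtask
  }
  order.push('loop done')
  // Finally let a real task run, so the pending timer gets its turn.
  await new Promise((resolve) => setTimeout(resolve, 10))
  return order
}
```

Calling `demoStarvation(10000)` should resolve to `['loop done', 'timer']`: the timer only gets to run after the loop is over, however many iterations it has.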

The only ways for an async function to have its follow-up triggered in a later tick of the event loop are to:

  • Perform some non-blocking IO (http call or filesystem usually)
  • Use setTimeout()

It should also be noted that modern-async is designed to use the microtask queue as much as possible, because it is way faster. I can assure you that in the latest version all functions that could possibly be coded using the microtask queue instead of the event queue were coded that way. (It was not the case in earlier versions.)

So whether you declare async functions, use await, use Array.map(), use modern-async.map() or all of that combined, it will still block your current process as long as you're not using setTimeout(), directly or indirectly.

Now that this is said, there is a helper in modern-async that is specifically designed for your use case: https://nicolas-van.github.io/modern-async/modern-async/1.1.0/Delayer.html

By the way, I just realized that the documentation of Delayer is kind of wrong. That's not exactly how it behaves. I will correct that.

OK, new description for the Delayer class:

 * A class used to spread time or cpu intensive operations on multiple tasks in the event loop in order
 * to avoid blocking other tasks that may need to be executed.
 *
 * It is configured with a trigger time, which represents the maximum amount of time your tasks should
 * monopolize the event loop. Choosing an appropriate trigger time is both important and hard. If too low
 * it will impact the performance of your long running algorithm. If too high it will impact the other
 * tasks that need to run in the event loop.
 *
 * When using Delayer your code should contain frequent calls to `await delayer.checkDelay()`, usually
 * at the end of every loop. `checkDelay()` will check the amount of time that elapsed since the last time
 * it triggered a new task in the event loop. If the amount of time is below the trigger time it returns
 * an already resolved promise and the remaining computation will be able to continue processing in a
 * microtask. If not it will call the `delay()` function that will retrigger the operation in a later task
 * of the event loop.
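
To make the description above concrete, here is a minimal stand-in that follows the same idea. It is a sketch written for this thread, not the library's implementation: `SimpleDelayer` and `squareAll` are hypothetical names, and only the `checkDelay()` method name and the notion of a trigger time come from the description.

```javascript
// Hypothetical stand-in that mimics the behaviour described above.
class SimpleDelayer {
  constructor(triggerTime) {
    this.triggerTime = triggerTime // max time in ms to monopolize the event loop
    this.last = Date.now()
  }

  async checkDelay() {
    // Below the trigger time: return at once, the caller resumes in a microtask.
    if (Date.now() - this.last < this.triggerTime) return false
    // Otherwise resume in a later task so other work can run first.
    await new Promise((resolve) => setTimeout(resolve, 0))
    this.last = Date.now()
    return true
  }
}

// Usage: a CPU-bound loop that checks the delayer at the end of every iteration.
async function squareAll(values, delayer) {
  const out = []
  for (const v of values) {
    out.push(v * v)
    await delayer.checkDelay()
  }
  return out
}
```

With a trigger time of a few milliseconds most iterations pay only the cost of a microtask, and a real timer round trip happens only once the loop has held the event loop for that long.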

Dear Nicolas, thanks for the amazing response and support.

From what I have learned from you, here is a solution I made that, in my opinion, can be used to resolve this issue:

function * nonBlockingGenerator(array, func, done){
    for(let e of array){
        // console.log(e)
        yield func(e);
    }
    done()
}

const nonBlockingForEach = (array, func) => {
    return new Promise((resolve, reject) => {
        const g = nonBlockingGenerator(array, func, resolve);
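        const execution = () => {
            setTimeout(() => {
                try {
                    // Reschedule only while the generator still has work;
                    // without this check the timer chain never stops.
                    if (!g.next().done) execution();
                } catch (err) {
                    reject(err);
                }
            })
        }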
        execution();
    })
}

const veryLargeArray = Array(10000).fill(1);
const func = (value) => {
    // synchronous yet simple operations
    return value;
}


let intervals = 0;
const interval = setInterval( () => {
    intervals++;
},0)

const start = new Date()
nonBlockingForEach(veryLargeArray, func).then(() => {
    console.log("done", intervals, "Time taken ms:", new Date() - start, "ms per exec:", (new Date() - start)/veryLargeArray.length)
    clearInterval(interval)
    return;
})

Took almost 1 ms per execution, which is very slow.
[Screenshot]

Replacing setTimeout with setImmediate is much, much faster: 0.0205 ms per execution on average.
[Screenshot]
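
For reference, a setImmediate variant could look like the sketch below. `nonBlockingForEachImmediate` is a hypothetical name, and `setImmediate` is assumed to be available, which is the case in Node.js but not in browsers in general. The gap between the two timings is plausibly explained by Node.js treating a `setTimeout` delay below 1 as 1 ms, whereas `setImmediate` callbacks run on the next iteration of the event loop without a minimum delay.

```javascript
// Hypothetical variant: one item per event loop iteration, scheduled with setImmediate.
function nonBlockingForEachImmediate(array, func) {
  return new Promise((resolve, reject) => {
    const iterator = array[Symbol.iterator]()
    const step = () => {
      try {
        const { value, done } = iterator.next()
        if (done) {
          resolve() // nothing left, stop rescheduling
          return
        }
        func(value)
        setImmediate(step) // continue in a later task, after pending IO callbacks
      } catch (err) {
        reject(err) // surface errors thrown by func
      }
    }
    setImmediate(step)
  })
}
```

Other callbacks can run between any two items, so a request arriving in the middle of the iteration does not have to wait for the whole array.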

Please let me know what you think of my solution.

Or should I just use the delayer inside a for loop or a .map or .forEach?

I did some more experimenting with the code above.

THE EXAMPLE BELOW USES setImmediate

/// [....code above ] const veryLargeArray = 

Promise.all([
    nonBlockingForEach(veryLargeArray, func),
    nonBlockingForEach(veryLargeArray, func),
    nonBlockingForEach(veryLargeArray, func),
    nonBlockingForEach(veryLargeArray, func)
]).then(() => {
    clearInterval(interval)
    console.table({
        ops,
        intervals,
        "Time Taken" :  new Date() - start,
        // divide by the total number of executions (4 runs over the array)
        "time per exec" : (new Date() - start)/(veryLargeArray.length*4),
    })

})

WHEN DONE IN PARALLEL 100000 items in 4 promises
[Screenshot]

WHEN DONE IN SERIES AS THE FIRST EXAMPLE 40000 items in 1 promise
[Screenshot]

As I explained previously, you should just use Delayer; that's what it's made for.

I'm closing this issue because I don't think it will go much further.