mmomtchev/node-gdal-async

Implement per-Dataset I/O scheduling

mmomtchev opened this issue · 2 comments

There are currently some use cases which can lead to significantly degraded I/O performance through thread starvation, or even to blocking the event loop in async mode. ASYNCIO.md describes the steps needed to avoid these situations, but the user cannot be expected to understand the internals of the project, as this defeats the point of having an abstraction layer on top of GDAL in the first place.

All of these problems can be solved by implementing per-Dataset I/O queues and replacing Nan::AsyncWorker with another implementation which schedules I/O operations.

This mechanism:

  • Must not consume a thread slot per Dataset, as there can be many more Datasets than slots in the thread pool
  • Must be fair to avoid starvation, i.e. an application constantly reading from 5 Datasets on a default Node.js thread pool with 4 threads must read (almost) uniformly from all of them
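These two requirements can be modeled with a pure-JS simulation (hypothetical `FairScheduler` class; the real implementation would live in C++ on top of libuv): 5 dataset queues are served round-robin by 4 worker slots, so no queue starves and an idle Dataset consumes no slot at all.

```javascript
// Hypothetical model of the fairness requirement: per-dataset queues
// served round-robin by a fixed number of worker slots.
class FairScheduler {
  constructor(slots) {
    this.slots = slots;        // number of concurrent worker slots
    this.queues = new Map();   // datasetId -> list of pending tasks
    this.ring = [];            // round-robin order of datasets with work
    this.running = 0;
    this.done = [];            // completed [datasetId, result] pairs
  }
  enqueue(datasetId, task) {
    if (!this.queues.has(datasetId)) {
      this.queues.set(datasetId, []);
      this.ring.push(datasetId);   // a dataset only enters the ring when it has work
    }
    this.queues.get(datasetId).push(task);
    this.pump();
  }
  pump() {
    while (this.running < this.slots && this.ring.length > 0) {
      const id = this.ring.shift();           // next dataset in round-robin order
      const q = this.queues.get(id);
      const task = q.shift();
      if (q.length > 0) this.ring.push(id);   // re-queue only if more work is pending
      else this.queues.delete(id);
      this.running++;
      Promise.resolve().then(task).then((r) => {
        this.done.push([id, r]);
        this.running--;
        this.pump();                          // free slot -> serve the next dataset
      });
    }
  }
}

// 5 datasets, 20 reads each, 4 slots: completions interleave (almost)
// uniformly instead of letting one dataset starve the others.
const sched = new FairScheduler(4);
for (let i = 0; i < 20; i++)
  for (let d = 0; d < 5; d++) sched.enqueue(d, () => `read ${d}/${i}`);
```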

The main challenge in supporting per-Dataset task queues is that libuv work can be queued only from the main thread.

The current queuing/multi-threading model of Node.js/Nan (or Node.js/N-API) implements a framework that allows executing a task on a background thread (selected from a pool) and then queuing a callback with the result on the event loop. It also allows sending best-effort progress callbacks, which run only if the main thread is idle at that moment. The background thread cannot schedule more libuv work and cannot continue dequeuing the dataset queue; it must wait for the main thread to schedule the next operation. Currently, one JS async function call = one libuv async task.
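The resulting starvation can be sketched with a pure-JS simulation (hypothetical names; the actual pool lives inside libuv): since every call becomes its own task in a single global FIFO, a burst of operations on one dataset delays a later, single operation on another dataset until the whole burst has drained.

```javascript
// Hypothetical model of the current behaviour: one JS async call =
// one pool task, and all tasks share a single global FIFO.
const SLOTS = 4;                 // default Node.js thread-pool size
const fifo = [];                 // global queue, in submission order
let active = 0;
const completionOrder = [];

function submit(label) {
  return new Promise((resolve) => {
    fifo.push({ label, resolve });
    drain();
  });
}
function drain() {
  while (active < SLOTS && fifo.length > 0) {
    const { label, resolve } = fifo.shift();  // strict FIFO, no per-dataset fairness
    active++;
    Promise.resolve().then(() => {
      completionOrder.push(label);
      resolve();
      active--;
      drain();
    });
  }
}

const all = [];
for (let i = 0; i < 12; i++) all.push(submit(`A:${i}`)); // burst on dataset A
all.push(submit('B:0'));   // a single read on dataset B, queued last
const done = Promise.all(all);
// B:0 completes only after all 12 reads on dataset A have finished.
```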

This framework is to be replaced with a new one that supports running multiple tasks (the dataset queues) with separate async contexts and then queuing the callbacks on the event loop to be run on the main thread, i.e. multiple JS async function calls on the same dataset are to be executed by a single libuv async task.
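A pure-JS sketch of the proposed model (hypothetical `DatasetQueue` class, not the actual C++ implementation): several queued JS-level calls on the same dataset are drained by a single background task, and each call's callback is resolved as its operation completes, instead of spawning one libuv task per call.

```javascript
// Hypothetical model of the proposed per-dataset queue: at most one
// background task per dataset is in flight at any time.
class DatasetQueue {
  constructor() {
    this.pending = [];
    this.taskRunning = false;
    this.completed = [];
  }
  // Each JS-level async call only enqueues an operation.
  submit(op) {
    return new Promise((resolve) => {
      this.pending.push({ op, resolve });
      if (!this.taskRunning) this.runTask();
    });
  }
  // Stands in for one libuv async task that keeps dequeuing until the
  // dataset queue is empty, then exits.
  async runTask() {
    this.taskRunning = true;
    while (this.pending.length > 0) {
      const { op, resolve } = this.pending.shift();
      const result = await op();   // the blocking GDAL I/O would happen here
      this.completed.push(result);
      resolve(result);             // callback queued back to the main thread
    }
    this.taskRunning = false;
  }
}

const ds = new DatasetQueue();
// Three JS async calls, drained in order by a single task.
const reads = Promise.all([1, 2, 3].map((n) => ds.submit(async () => n * 10)));
```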

This depends on libuv/libuv#3429