d3/d3-queue

Consider adding queue.deferImmediate.

natevw opened this issue · 5 comments

[Original subject: Syncronous queue.await triggering considered harmful]

In reference to the comment on #4, I would like to cast my vote against ever triggering the await/awaitAll callback synchronously.

While there is some discussion to the contrary, it is generally considered a bad practice to have this sort of inconsistency. One example of its harm is given under Keeping callbacks truly asynchronous.

Thinking of calling this library alone, the inconsistency doesn't seem so harmful — although there have been times I've felt my code could have been cleaner if I'd been able to define the queue and its error handling before starting to defer tasks.

However, if I want to use this library and still provide my callers with guaranteed asyncronocity, then the problem is more apparent:

function doThings(cb) {
    var q = queue(1).await(cb);
    if (maybe1) q.defer(thing1);
    if (maybe2) q.defer(thing2);
}

becomes

function doThings(cb) {
    var q = queue(1);
    if (maybe1) q.defer(thing1);
    if (maybe2) q.defer(thing2);
    if (maybe1 || maybe2) q.await(cb);
    else setTimeout(cb, 0);
}

Of course, hardly an hour after writing this I go and write some "batch via one queue while limiting parallelism via another" code that really wants to trust that a .defer call of an infinitely parallel queue triggers its method synchronously. Le sigh.

var batch = queue();
d.rows.forEach(function (row) {
    var batch_cb;
    batch.defer(function (cb) { batch_cb = cb; });
    q.defer(function (d, q_cb) {
        map(d, function () {
            batch_cb.apply(this, arguments);
            q_cb.apply(this, arguments);
        });
    }, row.doc || row);
});
batch.awaitAll(function (e, docs) {
    if (reduce && !e) q.defer(reduce, docs);
});

EDIT 2013/April/3: I realize now that this example is not necessarily contradictory to the request and probably led to some of the confusion. I'd consider it correct to start the deferred work as soon as the concurrency level permits, its when the .awaitAll happens that matters. See below for easy workaround.

#4 has been fixed; as of 7bd5059 (1.0.2), pop is no longer recursive.

In your example, you must call await after all of your defers; otherwise, you’d invoke the callback immediately.

var q = queue(1);
if (test1) q.defer(task1);
if (test2) q.defer(task2);
q.await(callback);

I could see it being desirable in some cases to use defer asynchronously. You can do that with a helper. For example, in Node you could use setImmediate:

var q = queue(1);
if (test1) q.defer(function(callback) { setImmediate(function() { task1(callback); }); });
if (test2) q.defer(function(callback) { setImmediate(function() { task2(callback); }); });
q.await(callback);

If you wanted to reduce that duplication a bit:

function immediate(method, callback) {
  var argv = [].slice.call(arguments, 1).concat(callback);
  setImmediate(function() { method.apply(null, argv); });
}

Then:

var q = queue(1);
if (test1) q.defer(immediate, task1);
if (test2) q.defer(immediate, task2);
q.await(callback);

I could see maybe doing a queue.deferImmediate method, with the same basic idea:

queue.deferImmediate = function(method) {
  var argv = slice.call(arguments, 1);
  return queue.defer(function(callback) {
    argv.push(callback);
    setImmediate(function() { method.apply(null, argv); });
  });
};

I’m just not sure if it’s worth the additional API surface area.

Closing this for now, but I’ll reopen if you want to discuss further.

I'm not quite sure we're talking about quite the same thing. I'm fine with .defer starting execution immediately and am not requesting a deferImmediate — in fact, almost the opposite.

The bug/feature is that .await/.awaitAll may or may not trigger its "event" asynchronously — depending on whether the queue currently has items in it. I don't need this library to workaround other codes' premature callbacks, I wanted it to stop firing premature callbacks itself!

However, I (reluctantly) agree that this synchronous behaviour has its uses. Given that .await does not support multiple "event handler" registration anyway (e.g. you can't pass a queue around and let others listen for depletion events, i.e. it is (arguably) not intended to work like the event registration APIs I am comparing it to), no great harm comes from leaving it as-is — for all supported usage patterns the caller will always have enough information to know whether calling await is safe or not for its own purposes.

Finally, for posterity, I would actually recommend dealing with this by adding a known-async task to any queue where correct async behaviour is important, rather than the mess above:

var workaroundTask = (typeof process === 'object' && process.nextTick) || (typeof window === 'object' && window.setImmediate) || setTimeout;

function doThings(cb) {
    var q = queue(1).defer(workaroundTask).await(cb);
    // now we can do whatever and trust that our caller's code won't get triggered earlier than they expect
    if (maybe1) q.defer(thing1);
    if (maybe2) q.defer(thing2);
}

I see, but still, you’re not supposed to defer tasks after setting the await callback. If you want to guarantee that your callback is asynchronous, you should say:

var q = queue(1);
if (maybe1) q.defer(thing1);
if (maybe2) q.defer(thing2);
q.defer(setImmediate).await(callback);

You could modify what I proposed earlier to make deferImmediate not take any arguments for the same effect:

queue.deferImmediate = function() {
  return queue.defer(setImmediate);
};

And then your code is slightly shorter:

var q = queue(1);
if (maybe1) q.defer(thing1);
if (maybe2) q.defer(thing2);
q.deferImmediate().await(callback);

But you might want to choose between setImmediate, process.nextTick, requestAnimationFrame and setTimeout, so it makes more sense to document this pattern than to add a deferImmediate (or similar) method to the Queue API.