smol-rs/async-task

Efficiently support bounded queues for scheduling?

jasta opened this issue · 5 comments

jasta commented

I'm interested in using a bounded queue for scheduling but all the existing examples use an unbounded queue and I'm not sure how to make it efficient and runtime safe (i.e. no panics) to switch to a bounded queue. I'd like my spawn function to instead return a kind of busy/full error. I've created an example repo that shows some of the challenges I faced trying to get this working: https://github.com/jasta/async-task-bounded-example, which does work but with a couple major caveats:

  1. It seems we need to inefficiently track the spawned Task's and loop through all of them each time runnable.run() is called to make sure the finished ones are cleared out. This tracking lets us know if we were to have more than n unfinished spawned tasks that could all spuriously wake up at the same time and fill our queue beyond n capacity.
  2. It's not clear if schedule() can ever occur twice before run(), definitely breaking (1) above if it did. I know it's not supposed to happen nominally, but can it under edge cases? run() seems idempotent, so if it was happening today even if by mistake I'm not sure you'd get any adverse runtime behaviour.

I'd propose that to make this more reliable and efficient we add an API along-side run() that returns whether the task is finished (maybe an enum return that tells us strictly more than the current run). Perhaps also a stateful check and associated tests for schedule to make sure it can't be run twice if that is indeed the case; or if not, finding another way to guarantee that we can efficiently protect against the schedule queue becoming full.

For some context, this is to make it possible to call wake() (and thus schedule tasks) from an ISR function in an embedded system it is important that we not block the routine or call into libc functions, including malloc (see Introduction to RTOS - Hardware Interrupts for more reading).

EDIT: The repository has been updated with an efficient and correct solution.

jasta commented

For further context, I'm looking at addressing this in ivmarkov/edge-executor#1

Here's an idea: what if we add an API to FallibleTask that lets you get back the Runnable on failure? It could be awaited on the task, and it returns one of:

  • Ok(val) for a successful task completion.
  • Err(AlreadyDropped) if the Runnable was already dropped before the function was called.
  • Err(Dropped(runnable)) if the runnable is dropped while the future is running. This way the user can recover the runnable.

In this case it would be up to the user to determine what to do with the runnable; either drop it or put it someplace else. Then for your use case you write a wrapper around FallibleTask that accommodates for this.

jasta commented

Hmm, I'm not sure exactly how that helps but it did give me an idea I hadn't considered before. I could just wrap the input future like:

async move {
  fut.await;
  unfinished_tasks.fetch_sub(1, Relaxed);
}

This actually seems to solve the problem quite elegantly, though it's unclear still re (2): can Runnable::schedule() be called twice somehow? The ownership rules seems to say no, but the liberal usage of unsafe makes that claim hard to verify.

EDIT: I pushed new commits to the github repo that explores this solution and it's definitely on the surface efficient and clean enough to work, but the issue still remains as to whether (2) will be a problem.

There should be no way for two Runnables to exist at once, if that happens that is a critical safety bug. The schedule function should only ever be called after running runnable.schedule() or runnable.run().

jasta commented

Ok then closing this issue with an acceptable solution above, thanks for the discussion!