async-rs/async-std

Alternative to AsyncDrop


Background

AsyncDrop is a feature that is badly needed in many scenarios.
Among them, the widely discussed ones include:

  • Structured Concurrency, which needs it to asynchronously wait for
    nested tasks to complete to avoid blocking scheduler/runtime threads.
spawn(async {
    let inner_task = spawn(async {
        // ...
    });
    // 1 - make inner_task "drop = cancel + wait" to achieve structured
    //     cancellation.
    // 2 - make drop "async" to avoid blocking scheduler/runtime threads.
})

On the other hand, AsyncDrop is HARD to design and, since 2019, has never
reached the RFC stage.

Serious CHALLENGES include, e.g., how to forbid cancellation of async drops
and how to prevent access to a struct whose drop is in progress. The async WG's
goal is to solve these without any runtime support, which makes it even more
challenging.

The final design, if we manage to get there, would probably be either imperfect
or complex.

Discussion

There is, however, an alternative to AsyncDrop that still makes clean-up work async.

The idea is to let futures OPT IN to:

  • Being aware of cancellation.
  • Doing async clean-up work before Drop. (Drop remains the goalkeeper that does
    whatever is left.)

The specific goals are:

  • 1 - Everything is opt-in; it should work fine if only some, or none, of the existing
    code opts in.
  • 2 - The cancel-ee (typically an "atomic" Future) should be able to know it is
    cancelled and be polled until its clean-up work is done.
  • 3 - The cancel-er (typically a scheduler, or some Future combiner, such as select)
    should be able to tell its descendants about the cancellation, and poll them if necessary.

Surprisingly, when it comes to the design, it is rather straightforward. It
requires nothing magic or complex, and no compiler support is needed. That makes
me believe it is probably the right approach to async cancellation.

Design

On the CANCEL-EE (typically an "atomic" Future) side, it needs an API to know
whether it is cancelled.

The shape would be:

fn is_cancelled() -> bool

The intended usage would be:

impl Future for SomeFuture {
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if is_cancelled() { 
            // transition the state machine to the CANCELING state
        }
        // ...state machine work
    }
}
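To make the intended usage concrete, here is a minimal sketch of a cancel-aware "atomic" future. It is not a proposed implementation: is_cancelled() is stubbed with a thread-local flag, and the State enum, the steps_left counter, and the "cleaned up" output are hypothetical names chosen only for illustration.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

thread_local! {
    // Stand-in for the ambient cancel state (an assumption of this sketch).
    static CANCELLED: Cell<bool> = Cell::new(false);
}

// Hypothetical stub; the real function would be provided by the runtime.
fn is_cancelled() -> bool {
    CANCELLED.with(|c| c.get())
}

enum State {
    Running,
    // Number of further polls the async clean-up needs (illustrative only).
    Cancelling { steps_left: u32 },
    Done,
}

struct SomeFuture {
    state: State,
}

impl Future for SomeFuture {
    type Output = &'static str;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // SomeFuture is Unpin, so a plain &mut is enough here.
        let this = self.get_mut();
        // Transition to CANCELLING the first time cancellation is observed.
        if is_cancelled() && matches!(this.state, State::Running) {
            this.state = State::Cancelling { steps_left: 2 };
        }
        match this.state {
            // The normal state machine work would go here.
            State::Running => Poll::Pending,
            State::Cancelling { steps_left: 0 } => {
                this.state = State::Done;
                Poll::Ready("cleaned up")
            }
            State::Cancelling { steps_left } => {
                // One step of async clean-up, then ask to be polled again.
                this.state = State::Cancelling { steps_left: steps_left - 1 };
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            State::Done => panic!("polled after completion"),
        }
    }
}
```

Note that Drop is untouched: if the future is dropped while still in the Cancelling state, the usual synchronous Drop would finish whatever is left.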

On the CANCEL-ER side (typically a scheduler, or some Future combiner, such as select),
it needs an API to tell its descendants about the cancellation, and poll them if necessary.

The shape would be:

fn with_cancel_state<F: FnOnce()>(cancelled: bool, poll: F) -> bool

The intended usage would be:

// Poll the child unless it is being cancelled and has not opted in.
if !cancel_child || child_cancel_aware {
    child_cancel_aware = with_cancel_state(cancel_child, || {
        poll(child_future, cx)
    });
}

That's all. Suddenly, all the goals mentioned above are satisfied.

Analysis

First, let me explain a little about the two added APIs.

  • with_cancel_state() returns true if any descendant of child_future called
    is_cancelled().
    • Thus, the parent needs to keep polling child_future until its descendants finish
      their async cancel.
    • After that, no descendant calls is_cancelled() any more, and with_cancel_state()
      returns false.
    • At that point all the work of the parent is done.
  • is_cancelled() just receives the parent's cancel_child state, as you can see.
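The behaviour described in these bullets can be traced with a small sketch. Everything beyond the two function signatures is an assumption: the two APIs are backed by thread-local flags, an ancestor's cancellation is OR-ed into nested scopes, and Leaf, NoopWake, poll_once and cancel_and_drain are hypothetical helpers. The parent here polls in a tight loop, whereas a real scheduler would wait for wake-ups between polls, and the state is not restored if the closure panics.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

thread_local! {
    // Cancel state installed by the innermost with_cancel_state() scope.
    static CANCELLED: Cell<bool> = Cell::new(false);
    // Whether is_cancelled() was called inside the current scope.
    static OBSERVED: Cell<bool> = Cell::new(false);
}

fn is_cancelled() -> bool {
    OBSERVED.with(|o| o.set(true));
    CANCELLED.with(|c| c.get())
}

fn with_cancel_state<F: FnOnce()>(cancelled: bool, poll: F) -> bool {
    // Assumption: a cancellation requested by an ancestor stays visible.
    let outer_cancelled = CANCELLED.with(|c| c.replace(c.get() || cancelled));
    let outer_observed = OBSERVED.with(|o| o.replace(false));
    poll();
    let observed = OBSERVED.with(|o| o.get());
    // Restore the outer scope; an observation also counts for the ancestors.
    OBSERVED.with(|o| o.set(outer_observed || observed));
    CANCELLED.with(|c| c.set(outer_cancelled));
    observed
}

// Hypothetical cancel-aware leaf whose clean-up takes a few polls.
struct Leaf {
    cleanup_polls_left: u32,
}

impl Future for Leaf {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut();
        if this.cleanup_polls_left == 0 {
            // Clean-up finished: stop consulting the cancel state.
            return Poll::Pending;
        }
        if is_cancelled() {
            this.cleanup_polls_left -= 1; // one step of async clean-up
        }
        Poll::Pending
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls `child` once under the given cancel state; returns (aware, ready).
fn poll_once<Fut: Future + Unpin>(child: &mut Fut, cancelled: bool) -> (bool, bool) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut ready = false;
    let aware = with_cancel_state(cancelled, || {
        ready = Pin::new(&mut *child).poll(&mut cx).is_ready();
    });
    (aware, ready)
}

// Cancels `child` and keeps polling while some descendant still consults
// the cancel state. Returns the number of polls made after cancellation.
fn cancel_and_drain<Fut: Future + Unpin>(child: &mut Fut, mut aware: bool) -> u32 {
    let mut polls = 0;
    while aware {
        let (still_aware, ready) = poll_once(child, true);
        polls += 1;
        aware = still_aware && !ready;
    }
    polls
}
```

In this sketch a child that never calls is_cancelled() is reported as not aware, so cancel_and_drain makes zero polls and the parent simply drops it, as it would today.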

Let's check whether all the goals mentioned above are satisfied.

  • Goal 2 & Goal 3: Satisfied directly by the above explanation.
  • Goal 1: needs a little further explanation.
    • First of all, if no descendant of a parent ever opts in by calling is_cancelled(),
      NO EXTRA POLL is added.
    • Cancel-ers of the future-combiner type that do not cancel children, such as join, need
      to do nothing to opt in. (They just pass through the parent's cancel state and poll.)
    • Cancel-ers of the future-combiner type that do cancel children, such as select, need
      to use with_cancel_state() to opt in. But if they do not, their descendants still partially
      get the async clean-up ability from their ancestors. (That is, if the cancellation is
      initiated by an ancestor.)
    • Runtimes/Schedulers need to use with_cancel_state() to opt in. But if they do not, some
      descendants still get the async clean-up ability. (That is, if they are under an opt-in
      combiner that initiated the cancellation.)
    • After all, Drop will do all the remaining clean-up work as usual. We can see this approach
      as a PURE OPTIMIZATION.

Implementation

I won't detail the implementation, because it is obvious.

On the opt-in side,

  • For libraries like tokio/async-std, it takes little work to make all the
    schedulers and combiners opt in.
  • Only a small set of "atomic" Futures (but probably important ones, such as spawned tasks)
    whose Drop is async by nature will ever need to opt in.