StephenCleary/AsyncEx

Atomic wait semantics lead to continued processing after cancellation

WalkerCodeRanger opened this issue · 3 comments

I understand and agree with your concerns about "Try" being ambiguous. When I first saw the approach of using already canceled tokens to do atomic waits, I thought it was clever. However, I've since found that it leads to unexpected behavior after cancellation. I recommend that an alternative approach be found.

Explanation:

The typical pattern when using cancellation tokens is that it is not necessary to constantly check the token using ThrowIfCancellationRequested() because any method you pass the token to will do that for you. However, the atomic wait semantics break this. For example, in a method that acquires a lock, does some costly synchronous work then saves the result this can lead to extra processing and side effects even after the token has been canceled. The method below demonstrates this. If the token is already canceled by the time the method is entered, then assuming the lock can be acquired without waiting, it will be, and the costly processing and side effects will occur before the SaveAsync throws an OperationCancelled exception.

public async Task Example(CancellationToken ct = default)
{
    using(await _asyncLock.LockAsync(ct).ConfigureAwait(false))
    {
        var results = CostlyProcessing();
        DoSideEffects(results );
        await SaveAsync(results, ct);
    }
}

This behavior is contrary to the expectations and practices developers have built up from other async APIs. To handle this correctly, the developer must be conscious of which methods support atomic wait and explicitly put a ThrowIfCancellationRequested() before calling them. Given that acquiring a lock is a relatively costly operation, this is always the desired behavior.

Another way to see that the atomic waits approach has a problem is that it doesn't generalize. Imagine if all async APIs followed the pattern that if they could process immediately without delay, then they would continue despite receiving a canceled token. This would frequently lead to execution continuing after the operation was canceled until a delayed operation was hit.

You do bring up some good points. This would, however, be a significant breaking change.

This has the same behavior as Monitor.TryEnter. If you try to enter a Monitor with 0 timeout, it will try to take the lock before returning false. So it's a matter of which convention you want to follow. But I guess it would be a little more obvious if it had a Try in the method name.

My previous argument was based on the effect when not doing an atomic wait. I just needed to actually do an atomic wait for real and found some additional concerns.

When doing an atomic wait, the Lock method could throw an OperationCancelledException, but that exception isn't an error. It is the expected result when failing to acquire the lock. There are three problems with this.

  1. The using statement makes it difficult to put a try/catch around the call to Lock without including other code that could raise an OperationCancelledException because of an actual cancellation.
  2. The try/catch is a lot of boilerplate
  3. An exception is being used for a non-error case raising concerns about the performance implications of using an exception there.

The following code demonstrates what I mean:

public async Task Example(CancellationToken ct = default)
{
	try
	{
		using(_asyncLock.Lock(new CancellationToken(true)).ConfigureAwait(false))
		{
			await SaveAsync(results, ct);
		}
	}
	catch (OperationCancelledException)
	{
		// This will catch both the expected exception from the call to Lock(), and the actual cancellation
		// from the call to SaveAsync(). To avoid this, one would need to try/catch just the Lock() while
		// saving the disposable into a variable to put into a using after the try/catch.
	}
}

I created an extension method to AsyncLock that I think shows what could be a good usage pattern. It follows a convention we established for our codebase of naming methods that do not wait *Now. Using that name isn't necessary, it just demonstrates how such a method could work. (Note that the method doesn't need to be async because it never actually waits to acquire the lock. This allows for the use of an out parameter.)

public async Task Example(CancellationToken ct = default)
{
	using(_asyncLock.LockNow(out var held))
	{
		if (!held)
			return;

		await SaveAsync(results, ct);
	}
}