PlayFab/consuldotnet

NullReferenceException on cancell token in method AcquireLock

Closed this issue · 7 comments

Hello!
I try lock value by key with AcquireLock. If key already is locked and timeout in token, then I get NullReferenceException.

I use version 0.7.2.1 from NuGet.

Thanks for reporting this.

Do you have a stacktrace that shows where the NRE is? I'm not quite sure how to reproduce this from your description of the issue.

tk24 commented

From Lock.cs:

        private void DisposeCancellationTokenSource()
        {
            var cts = _cts;
            if (cts != null)
            {
                Interlocked.CompareExchange(ref _cts, null, cts);
                cts.Cancel();
            }
        }

        private Task MonitorLock()
        {
            return Task.Factory.StartNew(async () => {
                var cts = _cts;
            }
        }

Possible problems:

  1. Interlocked.CompareExchange => var cts = Interlocked.CompareExchange(ref _cts, null, _cts); if (cts != null) { }
  2. Task.Factory.StartNew execution can start after DisposeCancellationTokenSource => private Task MonitorLock(CancellationToken )

Reproduction is easy:

  1. Set the lock-on-key method AcquireLock
  2. Without removing the first lock try to install on the same key another lock with a timeout
    As a result, when installing the second lock, we always get NRE at the expiration of the timeout instead of OperationCancelledException.

Example:

var client = new ConsulClient();
var distributedLock = await client.AcquireLock("test", CancellationToken.None);
try
{
	try
	{
		var ts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
		var distributedLock2 = await client.AcquireLock("test", ts.Token);
	}
	catch (NullReferenceException e)
	{
		Console.WriteLine(e);
	}
	catch (OperationCanceledException)
	{

	}
}
finally
{
	await distributedLock.Release();
}

StackTrace:
at Consul.Lock.d__24.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Consul.ConsulClient.d__56.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()

Ah, it's this line: await _monitorTask.ConfigureAwait(false); - since the lock was never successfully acquiring, _monitorTask was never starting and never getting set.

Alright, I've made the fix will cut a new release shortly, this issue will be closed when it goes out. Once it does, this call will properly throw Consul.LockNotHeldException.

Thanks! But maybe throw OperationCancelledException instead LockNotHeldExcrption will better? Getting LockNotHeldException is difficult to understand whether the timeout was canceled or the lock was received and for some reason released. Can the OperationCancelledException better reflect the original problem?

The NRE has been resolved in 0.7.2.3. The only places LockNotHeldException is used is to indicate that the lock operation failed: on acquisition, failing due to an error or timeout, and on release to indicate you're trying to release a lock that isn't locked. It does not indicate or imply the lock was successfully acquired at any point.

Ok, Thanks!