redis/go-redis-entraid

[Bug]: Deadlock in StreamingCredentialsProvider: calling listeners under RWMutex while go-redis unsubscribes on re-auth failure

Closed this issue · 3 comments

Describe the bug

First of all, thank you for maintaining this library it is been really helpful.

I noticed a case where Redis stopped responding as expected and some client requests started timing out. After some investigation, I found that the issue comes from a deadlock in StreamingCredentialsProvider.

The problem happens because listener callbacks (OnNext / OnError) are invoked while holding RLock. If re-auth fails, go-redis may close the connection and trigger the provider's unsubscribe, which then tries to acquire Lock on the same RWMutex. Since RWMutex doesn't allow upgrading a read lock to a write lock, this leads to a deadlock.

As a result, token refresh can hang indefinitely, blocking re-auth updates and potentially the client.

To Reproduce

  1. Provider receives a new token and calls onTokenNext.
  2. onTokenNext acquires RLock and invokes listener.OnNext(t).
  3. ReAuthCredentialsListener.OnNext calls re-auth; on error it triggers onAuthenticationErr.
  4. onAuthenticationErr closes the connection (e.g. bad conn).
  5. Conn.Close() triggers the provider's unsubscribe.
  6. unsubscribe tries to acquire Lock, while RLock is still held.
  7. Deadlock occurs.

Expected behavior

Token refresh should fail gracefully and allow retry/recovery without hanging the provider.

Code Example

func (e *entraidCredentialsProvider) onTokenNext(t *token.Token) {
    e.rwLock.RLock()
    listeners := append([]auth.CredentialsListener(nil), e.listeners...)
    e.rwLock.RUnlock()

    for _, l := range listeners {
        l.OnNext(t)
    }
}

func (e *entraidCredentialsProvider) onTokenError(err error) {
    e.rwLock.RLock()
    listeners := append([]auth.CredentialsListener(nil), e.listeners...)
    e.rwLock.RUnlock()

    for _, l := range listeners {
        l.OnError(err)
    }
}

// This avoids calling external listeners while holding internal locks.

Go Version

1.23.4

Package version

v1.0.4

go-redis Version

v9.9.0

Redis Server Version

6.0.14

Additional environment details

  • The issue is not OS-specific (purely about lock semantics).
  • Observed with the current implementation of credentials_provider.go in this repo.

Additional context

Happy to contribute a PR with the proposed changes and tests if this approach looks good.

Hello @alex-poliushkin, thank you for reporting this. I will take a look shortly.

@alex-poliushkin thank you for reporting once again, the issue should be fixed in the latest patch release.

Great, thanks for fixing it!