StackExchange/StackExchange.Redis

Why does a timeout exception occur when connecting to Azure Cache for Redis using StackExchange.Redis?

purin-it opened this issue · 3 comments

Occasionally, a timeout exception occurs when connecting to Azure Cache for Redis using StackExchange.Redis from an application built with .NET Core (C#).
The following is an example of the log output at that time. Can the cause of the timeout exception be identified from this log?

【Exception:StackExchange.Redis.RedisTimeoutException: Timeout performing EXISTS (10000ms), next: SELECT, inst: 0, qu: 0, qs: 0, aw: False, bw: SpinningDown, rs: ReadAsync, ws: Idle, in: 0, last-in: 2, cur-in: 0, sync-ops: 20008366, async-ops: 1, serverEndpoint: (server name redis).redis.cache.windows.net:6380, conn-sec: 198677.16, aoc: 0, mc: 1/1/0, mgr: 10 of 10 available, clientName: RedisClient-15, IOCP: (Busy=0,Free=1000,Min=196,Max=1000), WORKER: (Busy=1,Free=32766,Min=196,Max=32767), POOL: (Threads=24,QueuedItems=0,CompletedItems=365522499,Timers=244), v: 2.7.17.27058 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts )

I checked the link mentioned in the log ( https://stackexchange.github.io/StackExchange.Redis/Timeouts ), but since qu and qs are 0, and the number of Busy IOCP and WORKER is less than the Min, I can't understand why the timeout exception occurred.

Therefore, could you let me know if the cause of the timeout exception can be identified from this log?
If it can be identified, please explain how to read the log. If it cannot be identified, please advise what other information should be investigated.

By the way, the operational status of this system is as follows:

  1. A timeout exception occurs if there is no response within 10 seconds.
  2. The version of StackExchange.Redis being used is 2.7.17.27058.
  3. Timeout exceptions occur no more than 20 times a day. Sometimes, there are fewer than 10.
  4. After checking the metrics, no anomalies were found in the connections or threads.
  5. Even when a timeout exception occurs, the system returns to normal operation within a few minutes.
  6. The application built with .NET Core (C#) runs on Azure Kubernetes.
  7. The version of .NET Core (C#) is 6.0.

The log looks good to me.
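For reference, here is a rough sketch of how the fields in such a message can be checked against the Timeouts article: `qu` (queued, not yet written), `qs` (sent, awaiting a response), `in` (bytes unread on the socket), and `Busy` versus `Min` for the IOCP and WORKER pools. It is written in Python purely for offline log analysis. The function names (`parse_scalars`, `parse_pools`, `findings`) are illustrative and not part of StackExchange.Redis, and the log string is a shortened copy of the message above.

```python
import re

# Shortened copy of the exception message from this issue (field values kept verbatim).
LOG = (
    "Timeout performing EXISTS (10000ms), next: SELECT, inst: 0, qu: 0, qs: 0, "
    "aw: False, bw: SpinningDown, rs: ReadAsync, ws: Idle, in: 0, last-in: 2, cur-in: 0, "
    "IOCP: (Busy=0,Free=1000,Min=196,Max=1000), "
    "WORKER: (Busy=1,Free=32766,Min=196,Max=32767), "
    "POOL: (Threads=24,QueuedItems=0,CompletedItems=365522499,Timers=244)"
)


def parse_scalars(message):
    """Return simple 'key: integer' fields such as qu, qs and in."""
    return {k: int(v) for k, v in re.findall(r"\b([a-z][a-z-]*): (\d+)\b", message)}


def parse_pools(message):
    """Return the counters inside 'NAME: (A=1,B=2)' groups such as IOCP and WORKER."""
    pools = {}
    for name, body in re.findall(r"\b([A-Z]+): \(([^)]*)\)", message):
        pools[name] = {k: int(v) for k, v in (item.split("=") for item in body.split(","))}
    return pools


def findings(message):
    """List the indicators that would point at a client-side or server-side backlog."""
    scalars, pools = parse_scalars(message), parse_pools(message)
    notes = []
    if scalars.get("qu", 0) > 0:
        notes.append("qu > 0: commands are queued and not yet written")
    if scalars.get("qs", 0) > 0:
        notes.append("qs > 0: commands were sent and are awaiting a response")
    if scalars.get("in", 0) > 0:
        notes.append("in > 0: bytes are waiting unread on the socket")
    for name in ("IOCP", "WORKER"):
        pool = pools.get(name, {})
        if pool.get("Busy", 0) > pool.get("Min", 0):
            notes.append(f"{name}: Busy exceeds Min, possible thread pool starvation")
    return notes


print(findings(LOG) or "no indicator found in these fields")
```

For the log in this issue the sketch reports nothing, which matches the reading that these fields do not point at a queueing or thread pool problem.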

The usual suspects in these cases are network blips and GC.

Judging by the number of completed items in the thread pool, your application has been running for a long time, which can sometimes turn a GC into a complex game of Tetris inside the heap segments.

Have you tried profiling the application and seeing how often GC happens and how long it takes?

Thanks for the response, @quassnoi. I will first check the GC occurrences.

Thank you for your assistance. We have confirmed how to investigate GC occurrences as you advised, but because the Redis timeout exceptions occur so infrequently, we have decided not to carry out that investigation.

I will close this issue now.