What are the possible reasons for different peaks in Server Load, Processor Time and CPU usage?
maomaomqiu opened this issue · 6 comments
Hi all, many thanks for the previous answers. I am now working on a purge task that clears persistent keys from Redis, and when I trigger the task I notice differences in Server Load, Processor Time and CPU usage between regions.
Region A
Configuration
- Premium 26 GB (2 × 13 GB)
Dashboards
- the peak indicates the trigger time
- there is no obvious increase in Server Load, Processor Time or CPU usage during that period
- I triggered the task 3 times; the dashboards look similar every time
Volume
- Total keys: ~8 million; the purge task clears ~7 million keys
Condition
- operations per second are 1.3-1.4k before the trigger
- get operations are slightly higher than set operations
Region B
Configuration
- Premium 26 GB (2 × 13 GB)
Dashboards
Volume
- Total keys: ~6.5 million; the purge task clears ~5.5 million keys
- the key distributions in Region A and Region B are similar
Condition
Purge Task Logic
// Patterns of the keys that need to be fetched from Redis
string[] patterns;

using (ConnectionMultiplexer connection = await ConnectionMultiplexer.ConnectAsync(config))
{
    // filter and validate servers, then pick one (filtering logic omitted)
    IServer server = connection.GetServers().First(/* filter server and check server logic */);

    List<Task> tasks = new List<Task>(patterns.Length);
    foreach (var pattern in patterns)
    {
        tasks.Add(RedisPersistentKeyPurgeAction(connection, server, pattern));
    }
    await Task.WhenAll(tasks).ConfigureAwait(false);
}
private async Task RedisPersistentKeyPurgeAction(ConnectionMultiplexer connection, IServer server, string pattern)
{
    // buffer of keys whose expiry will be set in one batch
    List<string> batchExpireBuffer = new List<string>(50);
    var db = connection.GetDatabase();
    await using var keys = server.KeysAsync(pattern: pattern).GetAsyncEnumerator();
    bool isLastKey = !await keys.MoveNextAsync();
    while (!isLastKey)
    {
        // after every 100 processed keys there is a short sleep
        await Task.Delay(200);
        // every 50 keys, or once the last matching key is reached,
        // batch-set the default time-to-live on the buffered persistent keys
        if (/* buffer is full or last key reached */)
        {
            await BatchSetExpiry(batchExpireBuffer, expiry, db);
            batchExpireBuffer.Clear();
        }
        // if batchExpireBuffer still has room (< 50 entries), buffer the current key
        if (batchExpireBuffer.Count < 50)
        {
            batchExpireBuffer.Add(keys.Current.ToString());
        }
        // other iterator logic (advancing the enumerator, etc.) omitted
        ...
    }
}
private Task BatchSetExpiry(List<string> setExpiryList, int expiry, IDatabase db)
{
    IBatch batch = db.CreateBatch();
    foreach (var key in setExpiryList)
    {
        // expiry is the default expire time, 12 hours
        batch.KeyExpireAsync(key, TimeSpan.FromSeconds(expiry));
    }
    // Execute() pipelines all queued EXPIRE commands at once;
    // the individual KeyExpireAsync tasks are not awaited here
    batch.Execute();
    // other logic omitted, e.g. exception handling
    return Task.CompletedTask;
}
Do you have any ideas about what could cause the difference?
It seems like this is a server-side question really, not a client one. Is the client doing anything incorrect here? I'm reading your question as "why didn't the first server have the same impact?" - there are many reasons if that's the case, from shard counts to SKU sizes, etc. We can't really speak to server impact here because that's wildly variable depending on the hosting setup, replication, latency, concurrent load, etc.
If you can repro bad load patterns, it'd be best to engage the hosting team here to pose that question. If I'm missing a client-side question though: please clarify, happy to answer.
Thanks @NickCraver, I can repro.
The bad load pattern is tolerable, since the peak only lasts about 1 minute.
I just wonder about the possible root cause, so that if a similar operation is needed in the future, I can avoid the peak.
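For reference, here is a rough sketch of the kind of throttling I have in mind (not the actual task: PurgeAsync, FlushAsync, BatchSize, DelayMs and the 12-hour expiry are illustrative names and values, and it assumes a StackExchange.Redis version where IServer.KeysAsync accepts a pageSize argument). The idea is to use a smaller SCAN page size and await each EXPIRE batch before scanning further, so commands are spread out instead of being pipelined in one burst:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using StackExchange.Redis;

internal static class ThrottledPurgeSketch
{
    private const int BatchSize = 50;                                 // keys per EXPIRE batch (illustrative)
    private const int DelayMs = 200;                                  // pause between batches (illustrative)
    private static readonly TimeSpan Expiry = TimeSpan.FromHours(12); // default time-to-live

    public static async Task PurgeAsync(IServer server, IDatabase db, string pattern)
    {
        var buffer = new List<RedisKey>(BatchSize);

        // A smaller pageSize keeps each SCAN call on the server cheap.
        await foreach (var key in server.KeysAsync(pattern: pattern, pageSize: BatchSize))
        {
            buffer.Add(key);
            if (buffer.Count >= BatchSize)
            {
                await FlushAsync(db, buffer);
                buffer.Clear();
                await Task.Delay(DelayMs); // throttle between batches
            }
        }

        // Expire whatever is left in the buffer once the scan completes.
        if (buffer.Count > 0)
        {
            await FlushAsync(db, buffer);
        }
    }

    private static Task FlushAsync(IDatabase db, List<RedisKey> keys)
    {
        // Pipeline the EXPIREs as one batch, but await them so the next SCAN
        // page is not requested until the server has absorbed this batch.
        IBatch batch = db.CreateBatch();
        var tasks = new List<Task>(keys.Count);
        foreach (var key in keys)
        {
            tasks.Add(batch.KeyExpireAsync(key, Expiry));
        }
        batch.Execute();
        return Task.WhenAll(tasks);
    }
}

Awaiting the batch tasks adds backpressure, so at most one batch of EXPIREs is outstanding at any time.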
Hi @NickCraver, could you suggest some ways to engage the hosting team? Many thanks in advance!
@maomaomqiu Please engage support via 'Support + Troubleshoot' on the cache in the Portal
Thanks for the reply, do you mean the Azure portal? @philon-msft