Cache key gets stuck forever
Closed this issue · 8 comments
We have noticed that sometimes a cache key gets stuck in the cache forever, as if I had set expiration = infinite.
Only restarting the pod helps.
I am not sure whether it is related, but we also use the Remove method (for a hot reload feature, where a cache key should be expired by a domain event):
if (Cached<TValue>.TryGet(internalKey, out var cacheValue))
    cacheValue.Remove();
So far we have noticed this problem only with types that have the hot reload feature, so I guess it may be related.
Cache key type in our case (if it is important):
record struct InternalCacheKey<TKey>(TKey Key, string MethodName, string FilePath);
TKey is (short id, short langId),
but we had the same issue with different keys.
expirationTime = TimeSpan.FromSeconds(600)
(10 minutes)
I tried to reproduce it but had no success.
It also does not happen often in production, but it causes a lot of problems.
Maybe you have an idea of what could cause this bug?
We use only 3 methods:
Cached<TValue>.TryGet (for one key)
Cached<TValue>.Save (for one key)
cacheValue.Remove()
Hi, this sounds like something that may occur with struct tearing (observing a partial update of a cache item, since it is a struct that holds a TValue and a long expiration timestamp), but in practice it should not be possible to reach, since the dictionary implementation should be swapping the node reference rather than its contents.
Could you give more details on the hot reload feature you are using in this context?
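To make the tearing concern above concrete, here is a minimal sketch of why swapping a reference avoids it. It is not the NonBlocking dictionary's code; Item, Node and Slot are hypothetical names used only for illustration. The assumption is simply that a struct with two fields is wider than one machine word, so overwriting it in place is not guaranteed to be atomic, while writing a single object reference is.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a cache item: a value plus an expiration timestamp.
// Two fields make the struct wider than one machine word, so an in-place
// overwrite of a shared copy is not guaranteed to be atomic.
public readonly struct Item
{
    public readonly string Value;
    public readonly long ExpiresAt;

    public Item(string value, long expiresAt)
    {
        Value = value;
        ExpiresAt = expiresAt;
    }
}

// Hypothetical node: the struct is stored inside an immutable heap object.
public sealed class Node
{
    public readonly Item Item;
    public Node(Item item) => Item = item;
}

public sealed class Slot
{
    private Node _node;

    // Publishing replaces the whole node with one reference write,
    // and reference writes are atomic in .NET.
    public void Publish(Item item) => Volatile.Write(ref _node, new Node(item));

    // A reader takes one snapshot of the reference, so Value and ExpiresAt
    // always come from the same Publish call.
    public bool TryRead(long now, out string value)
    {
        var node = Volatile.Read(ref _node);
        if (node != null && node.Item.ExpiresAt > now)
        {
            value = node.Item.Value;
            return true;
        }

        value = null;
        return false;
    }
}
```

With this layout a reader can see the old item or the new item, but never the value of one paired with the expiration of the other.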
Hi, hot reload just listens for some domain events and removes the key from the cache by calling Remove, which forces the next read request to reread the value from the db and save it to the cache.
Example of removing one of the keys on an event:
// short id, short langId: taken from the event data
var internalKey = new InternalCacheKey<(short, short)>((id, langId), methodName, filePath); // build the key to find it in the cache
if (Cached<TValue>.TryGet(internalKey, out var cacheValue))
    cacheValue.Remove();
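The whole flow described above (read from the cache, fall back to the db, evict on a domain event) can be sketched roughly as follows. This is a self-contained illustration only: SketchCache, GetOrLoad and Evict are hypothetical names built on ConcurrentDictionary, not the Cached<TValue> API, and the clock is injected so that expiration can be exercised without waiting.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for the real cache; only the read/save/remove flow matters here.
public sealed class SketchCache<TKey, TValue> where TKey : notnull
{
    private readonly ConcurrentDictionary<TKey, (TValue Value, long ExpiresAtMs)> _entries = new();
    private readonly Func<long> _nowMs; // injected clock, in milliseconds

    public SketchCache(Func<long> nowMs) => _nowMs = nowMs;

    // Read path: return the cached value if it has not expired,
    // otherwise load it (e.g. from the db) and cache it with an expiration.
    public TValue GetOrLoad(TKey key, Func<TKey, TValue> load, TimeSpan expiration)
    {
        if (_entries.TryGetValue(key, out var entry) && entry.ExpiresAtMs > _nowMs())
            return entry.Value;

        var value = load(key);
        _entries[key] = (value, _nowMs() + (long)expiration.TotalMilliseconds);
        return value;
    }

    // Event path: drop the key so that the next read goes back to the db.
    public bool Evict(TKey key) => _entries.TryRemove(key, out _);
}
```

One property of this sketch worth keeping in mind: a read that loaded the old value just before the event can still save it after Evict has run. The entry would then stay stale only until its expiration, so by itself this would not explain a value that survives for hours.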
A little update:
I've got a dump from the pod with the problem, and I've found the corresponding entry in the cache dictionary's _entries.
Honestly, I don't see any problem: this entry is marked with TOMBSTONE (a static value) as its value.
So it is a "deleted" value. I created the same value in the cache locally and everything works well: TryGet just returns false.
So for now I have no idea how the cache returns a value for this key.
I suspected there was another entry instance with the same key, but no, there is only one in the dump.
There is also only one instance of the entries array and only one instance of DictionaryImpl.
Here is the entry value from the dump:
{
"hash": -159001085,
"key": {
"@ref": "0x00007fd2f1da13d0",
"@type": "NonBlocking.Boxed<FastMemoryCache+InternalCacheKey<ValueTuple<Int16, Int16>>>",
"writeStatus": 0,
"Value.<Key>k__BackingField.Item1": 118,
"Value.<Key>k__BackingField.Item2": 36,
"Value.<MethodName>k__BackingField": "someMethodName",
"Value.<FilePath>k__BackingField": "/__w/1/s/somePathToFile.cs"
},
"value": {
"@ref": "0x00007fd2dd237e68",//address of TOMBSTONE
"@type": "System.Object",
"@comment": "written above"
}
}
0x00007fd2dd237e68 is the address of TOMBSTONE. I worked this out because it is an empty object and I found the same address in the PRIME object instance:
{
"@ref": "0x00007fd2dd237e80",
"@type": "NonBlocking.DictionaryImpl+Prime",
"originalValue": {
"@ref": "0x00007fd2dd237e68",
"@type": "System.Object"
}
}
For now I couldn't understand the reason, but I'll continue the investigation. Maybe next time I should take a dump with type="WithHeaps" instead of "Full" to be able to debug it.
Interesting, and thanks for looking into this further. I completely forgot about this issue, but I will look into the NonBlocking dictionary implementation again on the off chance it has a race condition or a logic bug that may lead to such a scenario. Indeed, a TOMBSTONE entry should never be returned...
Going back to the issue description, to clarify: in what way does the cache item being stuck manifest in your application code? Just TryGet returning true, or something else?
Yes, it looks like TryGet is returning true, because the value in the db was updated and the other pods started returning the new value, but one pod keeps returning the old one, even after a few hours. expirationTime = TimeSpan.FromSeconds(600) (10 minutes).
Even if TOMBSTONE were returned, I cannot imagine how it could be transformed into our cached object.
We use the object right after getting it from the cache and there is no exception (such as a NullReferenceException).
Thanks for your attention)
Sorry for bothering you. We have found a mysterious problem on our side. The root cause is unknown, but for now everything related to the cache is good. I'll close this issue.
Thank you for your responses, and excuse me for wasting your time.
No problem. I did try to reproduce the problem but couldn't; unfortunately, that doesn't mean there isn't some rare condition that has a bug. Concurrent data structures have always scared me when it comes to these. Thanks for reporting.