davidjrh/dnn.rediscachingprovider

Cache Coherency Issue

Closed this issue · 6 comments

I've been using this cache provider in production for a few months now and it has been super. One thing I have noticed is that if you make a change, it is not reflected on the other nodes immediately. After thinking about it some, I realized there may be an issue: there is no mechanism for one machine to notify the others that their cache is invalid. Thus the only things that allow the cache to re-sync are time and an app pool restart.

Consider this code:
    public override object GetItem(string key)
    {
        try
        {
            // --------->> Never check to see if the local copy is current
            var v = base.GetItem(key);
            if (v != null)
            {
                return v;
            }
            var value = RedisCache.StringGet(KeyPrefix + key);
            if (value.HasValue)

    public override void Insert(string key, object value, DNNCacheDependency dependency, DateTime absoluteExpiration, TimeSpan slidingExpiration, CacheItemPriority priority,
        CacheItemRemovedCallback onRemoveCallback)
    {
        try
        {
            // Calculate expiry
            TimeSpan? expiry = null;
            if (absoluteExpiration != DateTime.MinValue)
            {
                expiry = absoluteExpiration.Subtract(DateTime.UtcNow);
            }
            else
            {
                if (slidingExpiration != TimeSpan.Zero)
                {
                    expiry = slidingExpiration;
                }
            }
            // -----------------> don't tell the other machines that we just updated the object.
            if (UseCompression)
            {
                var cvalue = CompressData(value);
                base.Insert(key, cvalue, dependency, absoluteExpiration, slidingExpiration,
                    priority, onRemoveCallback);
                RedisCache.StringSet(KeyPrefix + key, Serialize(cvalue), expiry);
            }
            else
            {
                base.Insert(key, value, dependency, absoluteExpiration, slidingExpiration,
                    priority, onRemoveCallback);
                RedisCache.StringSet(KeyPrefix + key, Serialize(value), expiry);
            }

If "Machine A" has a local copy of some item and then "Machine B" updates it, "Machine A" will not see the updated object until its local copy has aged out.
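To make the scenario concrete, here is a minimal sketch of it, assuming two web servers that each keep a local dictionary in front of one shared store. The `Node` and `StaleReadDemo` names are hypothetical and a plain `Dictionary` stands in for Redis, so this is not the provider's code, only the same read and write pattern.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for one web server: a local cache in front of a shared store.
public sealed class Node
{
    private readonly Dictionary<string, string> _local = new Dictionary<string, string>();
    private readonly Dictionary<string, string> _shared; // plays the role of Redis

    public Node(Dictionary<string, string> shared)
    {
        _shared = shared;
    }

    // Same pattern as GetItem above: the local copy wins and is never checked for freshness.
    public string Get(string key)
    {
        if (_local.TryGetValue(key, out var local))
        {
            return local;
        }
        return _shared.TryGetValue(key, out var remote) ? remote : null;
    }

    // Same pattern as Insert above: write locally and to the shared store, notify nobody.
    public void Insert(string key, string value)
    {
        _local[key] = value;
        _shared[key] = value;
    }
}

public static class StaleReadDemo
{
    // Returns what "Machine A" reads after "Machine B" has updated the item.
    public static string Run()
    {
        var shared = new Dictionary<string, string>();
        var machineA = new Node(shared);
        var machineB = new Node(shared);

        machineA.Insert("item", "v1"); // A now holds a local copy
        machineB.Insert("item", "v2"); // B updates the shared store; A is not told

        return machineA.Get("item");   // served from A's stale local copy
    }
}
```

Calling `StaleReadDemo.Run()` returns the old value even though the shared store already holds the new one, which is the behaviour described above.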

This is annoying but ok if the machines are behind a load balancer that provides session affinity (sticky sessions). However, if a session is not pinned to a machine, the results are likely to be chaotic and possibly destructive.

There are a number of solutions.

One is to keep track of all the nodes (in the webservers table) and create a mechanism to notify all the processors that they need to invalidate their local copy. I think the paid version of DNN does something along this line. I had a conversation about it years ago w/Nick when the SQL based caching provider was removed.

Another is to not cache locally and always go for the Redis Cache.

Personally, I like the latter:

  1. It is a simple solution and easy to get correct.
  2. Redis is super fast, I have used it (and memcache) in other similar situations and have never been worried about the latency (assuming that Redis is reasonably local and configured correctly)
  3. Reduces the memory footprint of the site, no more using local memory for cache storage (this matters in a shared hosting environment)

I suggest a parameter that allows the local cache to be turned off.
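A rough sketch of what such a switch could look like, assuming a boolean setting read once at start-up. The `useLocalCache` parameter and the `ConfigurableNode` and `LocalCacheToggleDemo` names are hypothetical, and a `Dictionary` again stands in for Redis. With the flag off, every read and write goes straight to the shared store.

```csharp
using System.Collections.Generic;

// Hypothetical node whose local cache can be switched off by a setting.
public sealed class ConfigurableNode
{
    private readonly Dictionary<string, string> _local = new Dictionary<string, string>();
    private readonly Dictionary<string, string> _shared; // plays the role of Redis
    private readonly bool _useLocalCache;                // hypothetical provider setting

    public ConfigurableNode(Dictionary<string, string> shared, bool useLocalCache)
    {
        _shared = shared;
        _useLocalCache = useLocalCache;
    }

    public string Get(string key)
    {
        // Only consult the local copy when local caching is enabled.
        if (_useLocalCache && _local.TryGetValue(key, out var local))
        {
            return local;
        }
        return _shared.TryGetValue(key, out var remote) ? remote : null;
    }

    public void Insert(string key, string value)
    {
        if (_useLocalCache)
        {
            _local[key] = value;
        }
        _shared[key] = value;
    }
}

public static class LocalCacheToggleDemo
{
    // Returns what node A reads after node B has updated the same key.
    public static string Run(bool useLocalCache)
    {
        var shared = new Dictionary<string, string>();
        var nodeA = new ConfigurableNode(shared, useLocalCache);
        var nodeB = new ConfigurableNode(shared, useLocalCache);

        nodeA.Insert("item", "v1");
        nodeB.Insert("item", "v2");

        return nodeA.Get("item");
    }
}
```

In this sketch `LocalCacheToggleDemo.Run(true)` still returns the stale value, while `LocalCacheToggleDemo.Run(false)` returns the current one, at the cost of a round trip to the shared store on every read.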

I've been working on a similar issue for almost a month now non-stop. I updated the Insert method to add this to the try-catch block:

finally 
{
    Logger.Info("Insert is telling other members to clear " + cacheKey);
    // Notify the channel
    RedisCache.Publish(new RedisChannel(KeyPrefix + "Redis.Insert", RedisChannel.PatternMode.Auto), InstanceUniqueId + "_" + cacheKey);
}

So now, any time a new key is inserted on System A, system B will receive the message and clear the cache key from memory on system B. Now, I've also added other debug messages to show that when I refresh the page on system B, the cache key DNN_TabModules55 is not found in memory, but it is reportedly querying Redis. Here's the rub, though. I'm running redis-cli monitor | find /I "get" and there are no calls for the cache key I'm using for debug.

Don't you need to do a subscribe and hook up a handler so that the publishes are seen?
The publish should return the number of subscribers that got the message.

This is already handled in the ProcessMessage method. Subscribers are registered by this line, which issues a PSUBSCRIBE in Redis: cn.GetSubscriber().Subscribe(new RedisChannel(KeyPrefix + "Redis.*", RedisChannel.PatternMode.Pattern), ProcessMessage); The else statement in ProcessMessage causes anyone subscribed to the channel to call RemoveInternal(cacheKey), using the cache key carried in the Publish call.
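The publish and subscribe flow discussed in this thread can be sketched without a Redis server. Everything below is an assumption made for illustration: `FakeChannel` is a hypothetical in-process stand-in for the Redis channel, `NotifyingNode` is not the provider's class, and the handler body is a guess at the behaviour described (ignore your own message, otherwise drop the local copy). Only the `InstanceUniqueId + "_" + cacheKey` message shape comes from the snippet earlier in the thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-process stand-in for a Redis pub/sub channel.
public sealed class FakeChannel
{
    private readonly List<Action<string>> _handlers = new List<Action<string>>();

    public void Subscribe(Action<string> handler)
    {
        _handlers.Add(handler);
    }

    // Delivers the message to every subscriber and returns how many received it.
    public int Publish(string message)
    {
        foreach (var handler in _handlers)
        {
            handler(message);
        }
        return _handlers.Count;
    }
}

public sealed class NotifyingNode
{
    // "N" format has no separators, so the first '_' always ends the sender id.
    private readonly string _instanceId = Guid.NewGuid().ToString("N");
    private readonly Dictionary<string, string> _local = new Dictionary<string, string>();
    private readonly Dictionary<string, string> _shared; // plays the role of Redis
    private readonly FakeChannel _channel;

    public NotifyingNode(Dictionary<string, string> shared, FakeChannel channel)
    {
        _shared = shared;
        _channel = channel;
        _channel.Subscribe(ProcessMessage);
    }

    public string Get(string key)
    {
        if (_local.TryGetValue(key, out var local))
        {
            return local;
        }
        return _shared.TryGetValue(key, out var remote) ? remote : null;
    }

    public void Insert(string key, string value)
    {
        _local[key] = value;
        _shared[key] = value;
        // Tell the other nodes to drop their local copy of this key.
        _channel.Publish(_instanceId + "_" + key);
    }

    private void ProcessMessage(string message)
    {
        var separator = message.IndexOf('_');
        if (separator < 0)
        {
            return; // not in the expected "<sender>_<key>" shape
        }
        var sender = message.Substring(0, separator);
        var key = message.Substring(separator + 1);
        if (sender == _instanceId)
        {
            return; // our own insert: keep the fresh local copy
        }
        _local.Remove(key);
    }
}

public static class InvalidationDemo
{
    // Returns what node A reads after node B has updated the same key.
    public static string Run()
    {
        var shared = new Dictionary<string, string>();
        var channel = new FakeChannel();
        var nodeA = new NotifyingNode(shared, channel);
        var nodeB = new NotifyingNode(shared, channel);

        nodeA.Insert("DNN_TabModules55", "v1");
        nodeB.Insert("DNN_TabModules55", "v2"); // A's handler evicts its local copy

        return nodeA.Get("DNN_TabModules55");   // falls through to the shared store
    }
}
```

Because the cache key itself can contain underscores (as `DNN_TabModules55` does), the sketch splits on the first underscore only and relies on the sender id containing none.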

I see... doh... I'll give it a play and see what I come up with.
I like this way MUCH better.

Hi, I found the issue causing the out-of-sync cache. On parametrized ClearCache calls, the "type" and "data" parameters were not being published on the Redis channel, so the matching entries were not actually being cleared on the other web servers.

I have added those parameters and it is now working properly. I have created a new release, 1.0.4, with the fix.

Super, thanks. I just applied the change, looking good so far!!