igniterealtime/openfire-hazelcast-plugin

Dynamically created ClusteredCache entries not expiring

GregDThomas opened this issue · 7 comments

[Transferred from https://issues.igniterealtime.org/projects/HZ/issues/HZ-6]

When hazelcast clustering is enabled, a plugin can create it's own Cache, using something like:

final String cacheName = "aaa.test-cache";
CacheFactory.setMaxSizeProperty(cacheName, maxCacheSize);
CacheFactory.setMaxLifetimeProperty(cacheName, maxLifetime);

final Cache<String, String> cache = CacheFactory.createCache(cacheName);

cache.put("key", "value");
assertThat(cache.get("key"), is("value"));
Thread.sleep(maxLifetime+1);
assertThat(cache.get("key"), is(nullValue()));

This appears to work because it is possible to dynamically configure a Hazelcast IMap (which is what backs the Openfire Cache when clustered) provided the IMap has not already been created.

However, as soon as a second node joins the cluster, the second node will create an IMap instance for each one existing on the first node. As there is no explicit config for "aaa.test-cache" it gets created on the second node with default values (set to infinite in hazelcast-cache-config.xml) - and this config is then reflected back on the first node. As a result, as soon as the second node joins, entries stop expiring on all nodes.

Potential solution include:

  1. Remember to manually edit the single hazelcast-cache-config.xml file on each server every time a plugin is installed that uses a Cache. In which case this should be closed as "Won't fix".
    Pros: No development effort
    Cons: Manual admin process required, no doubt prone to error

  2. When ClusteredCache puts an entry in the IMap, it could use map.put(key, object, maxLifetime, TimeUnit.Milliseconds) to explicitly set the expiry time on the entry, instead of the cache.
    Pros: Trivial to implement (already done to see if it works!)
    Cons: Doesn't work for the putAll(Map) method which has no equivalent. Doesn't work for the setMaxSizeProperty() property.

  3. HZ config files support an directive. The Hazelcast plugin could find any plugin specific cache configurations (cf. the coherence clustering plugin with the cache-config.xml file), and programmatically add these files to the cluster config before clustering starts.
    Pros: Works in all cases
    Cons: Non-trivial development effort.

  4. Upgrade to HZ 3.9, which supports dynamic configuration
    Pros: Works in all cases with little development effort (depending on how much the API has changed).
    Cons: HZ 3.9 is fairly new (only two days old at the time of writing!) - some time to allow others to find any bugs might be beneficial.

I'm open to any other suggestions, thoughts, etc. In the meantime, I'm going to implement option 2 locally as a short-term work-around.

[Comment from Greg Thomas @ https://issues.igniterealtime.org/browse/HZ-6?focusedCommentId=26625]

Note; option 2 was implemented in plugin v. 2.2.4 - however the putall()/setMaxSize() is still not properly honoured.

An update on Option 4, my current preferred solution;

According to hazelcast/hazelcast#10860 (comment) with Hazelcast 3.9+, a dynamic cache is now propagated by the creating node to all nodes in the cluster. So what is required is that the ClusteredCacheFactory should create a MapConfig appropriately configured for the map in question when it is created.

NOTE: The following is what I originally thought. I now believe it to be largely wrong, will leave it here to remind me of my past mistakes.

The IMap backing the clustered cache must be configured before the node joins the cluster.
Therefore, when the Hazelcast plugin starts, it is necessary to;
a) For each Cache that has been or could be created ...
b) Configure Hazelcast with the settings for that Cache for maxLifetime/maxSize
c) And only then join the cluster, and migrate each DefaultCache to a ClusteredCache,

NOTE: It is not sufficient to use every current DefaultCache to determine which Hazelcast IMap settings are required. It may be that a plugin (or Openfire) creates a Cache after the cluster has been joined. That Cache must be configured on all nodes in the cluster before any node creates it - or it will end up with default settings.

My suggestion is therefore;
a) When any Cache (DefaultCache or ClusteredCache) is created in Openfire, the Primary server adds it to a list persisted in the database (ofProperty?)
b) When the Hazelcast plugin starts up, it pre-configures a Hazelcast IMap with settings for all of these cache's

Drawback 1: If a new Cache is added after the cluster is joined, it will not have the correct settings until the entire cluster is restarted.
Drawback 2: The list of Cache's will keep growing. How to housekeep; perhaps purge unused entries every 24 hours?

This puzzles me: when a cache is created, either through declaration or programmatically, I'd expect that on a new cluster node, the same cache is created (as it'll have the same XML, or run the same code). How is that not the case?

a) It is the case since HZ 3.9, but
b) We're not actually (correctly) configuring the IMap today, instead relying on the defaults (100,000 entries / no expiry) and overriding the maxLifetime only when doing a put()

Even before 3.9, I mean. Assuming that we're running with the same configuration and code, behavior should be consistent across nodes - although apparently, it is not. I don't get why two cluster nodes that both instantiate the same Cache using the same code end up with two differently configured caches.

No, nor do I. Spent fairly (ahem) ages tracking that one down, only to find out it was fixed in 3.9. This is possibly a reminder to me to get around to fixing it properly, now.

As we just discussed in the Meet - let's go with fixing it with option 4, knowing that we'll probably lose the dynamic reconfiguration functionality that is a result of the current hack that we have in (adding expiry on every 'add'). We should prefer having consistent behavior regarding configuration of properties like maxLifetime and maxSize.