NullPointerException when loading plugin on senior cluster node

Question

NullPointerException when loading plugin on senior cluster node

Closed this issue 4 years ago · 7 comments

Just after loading a new version of the plugin on the senior cluster node, this got logged:

2020.09.04 12:41:34 ERROR [hz.openfire.cached.thread-1]: org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory - Unexpected exception running CallableTask[org.jivesoftware.openfire.archive.cluster.SendConversationEventsTask]
java.lang.NullPointerException: null
        at org.jivesoftware.openfire.archive.cluster.SendConversationEventsTask.run(SendConversationEventsTask.java:63) ~[?:?]
        at org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory$CallableTask.call(ClusteredCacheFactory.java:591) [hazelcast-2.5.0.jar!/:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
        at com.hazelcast.executor.impl.DistributedExecutorService$CallableProcessor.run(DistributedExecutorService.java:270) [hazelcast-3.12.5.jar!/:?]
        at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:227) [hazelcast-3.12.5.jar!/:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
        at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:64) [hazelcast-3.12.5.jar!/:?]
        at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:80) [hazelcast-3.12.5.jar!/:?]
2020.09.04 12:41:35 INFO  [pool-29118-thread-1]: com.reucon.openfire.plugin.archive.impl.JdbcPersistenceManager - Running database table migration to split full JID values into bare JID and resource-part components.
2020.09.04 12:41:35 DEBUG [pool-29118-thread-1]: com.reucon.openfire.plugin.archive.impl.JdbcPersistenceManager - Find columns with full JIDs in the fromJID column: these are rows that have NULL in the resource column, and a slash in the JID column.
2020.09.04 12:41:47 WARN  [hz.openfire.SlowOperationDetectorThread]: com.hazelcast.spi.impl.operationexecutor.slowoperationdetector.SlowOperationDetector - [172.31.25.193]:5701 [openfire] [3.12.5] Slow operation detected: com.hazelcast.executor.impl.operations.MemberCallableTaskOperation
Hint: You can enable the logging of stacktraces with the following system property: -Dhazelcast.slow.operation.detector.stacktrace.logging.enabled
2020.09.04 12:41:50 WARN  [hz.openfire.SlowOperationDetectorThread]: com.hazelcast.spi.impl.operationexecutor.slowoperationdetector.SlowOperationDetector - [172.31.25.193]:5701 [openfire] [3.12.5] Slow operation detected: com.hazelcast.executor.impl.operations.MemberCallableTaskOperation (2 invocations)

Answer 1 · 2020-09-04T12:46:39.000Z

And a bit later:

2020.09.04 12:45:21 ERROR [hz.openfire.cached.thread-7]: org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory - Unexpected exception running CallableTask[org.jivesoftware.openfire.archive.cluster.SendConversationEventsTask]
java.lang.ClassCastException: org.jivesoftware.openfire.plugin.MonitoringPlugin cannot be cast to org.jivesoftware.openfire.plugin.MonitoringPlugin
        at org.jivesoftware.openfire.archive.cluster.SendConversationEventsTask.run(SendConversationEventsTask.java:61) ~[monitoring-2.0.2-SNAPSHOT.jar!/:?]
        at org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory$CallableTask.call(ClusteredCacheFactory.java:591) [hazelcast-2.5.0.jar!/:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
        at com.hazelcast.executor.impl.DistributedExecutorService$CallableProcessor.run(DistributedExecutorService.java:270) [hazelcast-3.12.5.jar!/:?]
        at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:227) [hazelcast-3.12.5.jar!/:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
        at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:64) [hazelcast-3.12.5.jar!/:?]
        at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:80) [hazelcast-3.12.5.jar!/:?]

Answer 2 · 2020-09-04T13:16:27.000Z

What appears to be happening is that a cluster task is executed while various components are not yet (or no longer) properly configured, either during startup or shutdown

Answer 3 · 2020-09-04T13:22:37.000Z

I wonder if this is a side-effect of having #113 run in the background, preventing full initialization of some components.

Answer 4 · 2020-09-04T13:23:24.000Z

I often see this (used to?) with clustered plugins. It's generally best to only start clustering operations once clustering has started (!)

A naive plugin will start on "initialize" - which may be before clustering has started. So if clustering starts in the middle of plugin initialization, things start to break.

Answer 5 · 2020-09-04T13:31:11.000Z

Although very likely true, that was not the case with this instance: I deployed a plugin in a node that was already running (for a couple of hours, at least).

Answer 6 · 2020-09-04T14:22:43.000Z

This gets logged repeatedly on a cluster node where I've unloaded the plugin that's providing the task that's throwing this exception.

Explicitly shutting down tasks before unloading the plugin might help.

2020.09.04 14:21:07 ERROR [hz.openfire.cached.thread-5]: org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory - Unexpected exception running CallableTask[org.jivesoftware.openfire.archive.cluster.SendConversationEventsTask]
java.lang.NullPointerException: null
        at org.jivesoftware.openfire.archive.cluster.SendConversationEventsTask.run(SendConversationEventsTask.java:63) ~[?:?]
        at org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory$CallableTask.call(ClusteredCacheFactory.java:591) [hazelcast-2.5.0.jar!/:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
        at com.hazelcast.executor.impl.DistributedExecutorService$CallableProcessor.run(DistributedExecutorService.java:270) [hazelcast-3.12.5.jar!/:?]
        at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:227) [hazelcast-3.12.5.jar!/:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
        at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:64) [hazelcast-3.12.5.jar!/:?]
        at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:80) [hazelcast-3.12.5.jar!/:?]

Answer 7 · 2020-09-07T08:53:26.000Z

I'm thinking that the ClassCastException is unrelated to the NullPointerException that's the subject of this issue. I've raised a new issue for the ClassCastException: #115