centiservice/mats3

Improve handling / retry strategy of failed JMS authentications

MartinRossland opened this issue · 2 comments

When we setup Mats against an ActiveMQ message broker which requires authentication, it goes into a fairly aggressive retry mode on authentication failure - ends up doing a bunch of authentication attempts over time.

For an example, we have a single service doing over 4.000 attempts through Mats in 30min and for some reason this results in over 18.000 login attempts on the ActiveMQ message broker.

Following you find the log message from the JmsMatsStageProcessor along with the stacktrace from the JmsMatsJmsException which is thrown when a JMS connection could not be initialized. This is from one of the attempts for the example above where we try to create a connection without username and password (anonymous) against a ActiveMQ which was configured to not allow anonymous connections.

Log statement:

#JMATS# Got JmsMatsJmsException while trying to get new JmsSessionHolder. Chilling a bit, then looping to check run-flag.

Stack trace:

io.mats3.impl.jms.JmsMatsException$JmsMatsJmsException: Got problems when trying to create & start a new JMS Connection.
	at io.mats3.impl.jms.JmsMatsJmsSessionHandler_Pooling$ConnectionWithSessionPool.initializePoolByCreatingJmsConnection(JmsMatsJmsSessionHandler_Pooling.java:288)
	at io.mats3.impl.jms.JmsMatsJmsSessionHandler_Pooling.getSessionHolder_internal(JmsMatsJmsSessionHandler_Pooling.java:228)
	at io.mats3.impl.jms.JmsMatsJmsSessionHandler_Pooling.getSessionHolder(JmsMatsJmsSessionHandler_Pooling.java:159)
	at io.mats3.impl.jms.JmsMatsStageProcessor.runner(JmsMatsStageProcessor.java:267)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: javax.jms.JMSSecurityException: Failed to authenticated as [anonymous] from node [tcp://10.86.97.19:64289] in mode [ANONYMOUS]. Broker is configured to not allow anonymous access.
	at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:54)
	at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1405)
	at org.apache.activemq.ActiveMQConnection.ensureConnectionInfoSent(ActiveMQConnection.java:1488)
	at org.apache.activemq.ActiveMQConnection.start(ActiveMQConnection.java:530)
	at io.mats3.impl.jms.JmsMatsJmsSessionHandler_Pooling$ConnectionWithSessionPool.initializePoolByCreatingJmsConnection(JmsMatsJmsSessionHandler_Pooling.java:277)
	... 4 common frames omitted
Caused by: java.lang.SecurityException: Failed to authenticated as [anonymous] from node [tcp://10.86.97.19:64289] in mode [ANONYMOUS]. Broker is configured to not allow anonymous access.
	at com.skagenfondene.jwactivemq.broker.HatcheryAuthenticationPlugin$HatcheryAuthenticationBroker.authenticate(HatcheryAuthenticationPlugin.java:327)
	at com.skagenfondene.jwactivemq.broker.HatcheryAuthenticationPlugin$HatcheryAuthenticationBroker.addConnection(HatcheryAuthenticationPlugin.java:213)
	at org.apache.activemq.broker.BrokerFilter.addConnection(BrokerFilter.java:99)
	at org.apache.activemq.broker.BrokerFilter.addConnection(BrokerFilter.java:99)
	at org.apache.activemq.broker.BrokerFilter.addConnection(BrokerFilter.java:99)
	at org.apache.activemq.broker.TransportConnection.processAddConnection(TransportConnection.java:852)
	at org.apache.activemq.broker.jmx.ManagedTransportConnection.processAddConnection(ManagedTransportConnection.java:77)
	at org.apache.activemq.command.ConnectionInfo.visit(ConnectionInfo.java:139)
	at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:335)
	at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:200)
	at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
	at org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.java:77)
	at org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:125)
	at org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:302)
	at org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.java:77)
	at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
	at org.apache.activemq.transport.nio.NIOTransport.serviceRead(NIOTransport.java:162)
	at org.apache.activemq.transport.nio.NIOTransport$1.onSelect(NIOTransport.java:81)
	at org.apache.activemq.transport.nio.SelectorSelection.onSelect(SelectorSelection.java:98)
	at org.apache.activemq.transport.nio.SelectorWorker$1.run(SelectorWorker.java:123)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.lang.Thread.run(Thread.java:833)

(The SecurityException message comes from the message broker)

Would it be an idea to improve the handling of failed authentications in Mats? At least have a more "chill" retry strategy when JmsMatsJmsException has a SecurityException cause?

This is a problem that stems from ActiveMQ's JMS ConnectionFactory, together with a fairly aggressive retrying from Mats where it attempts to acquire a connection after a failure. In a degenerate situation like here, where the authentication is erroneous, this will obviously not succeed.

Since there is a set of threads per stage (default 2x cpus), x number of stages per endpoint, x endpoints, there are rather many threads that tries in vain to get a connection. While each individual stage processor actually do have a small sleep to avoid tight-spam-loops in such situations, this does not help much when there are something like hundred or more threads that all want a connection.

The pooling solution should probably hold back the threads in such a situation, instead of letting every thread attempt to actually get a connection. That is, if we've just failed to get a connection, it makes little sense to let the next thread in line immediately try again. Rather, we could hold them all back, thus getting a single retry per retry interval, instead of a heap of retries per retry interval.

The new and default JmsMatsJmsSessionHandler_PoolingSerial does as mused in previous commit: It serializes all accesses to creating JMS Connections, and have an exponential backoff between failed retries.