RabbitMQ connection retries halted the service startup (and may have crashed the server)
Closed this issue · 3 comments
Foxcapades commented
ERROR com.rabbitmq.client.impl.ForgivingExceptionHandler - Caught an exception when recovering topology Caught an exception while recovering exchange qa-vdi-bucket-notifications: connection is already closed due to connection error; cause: com.rabbitmq.client.MissedHeartbeatException: Detected missed server heartbeats, heartbeat interval: 60 seconds, RabbitMQ node hostname: 172.16.44.201
com.rabbitmq.client.TopologyRecoveryException: Caught an exception while recovering exchange qa-vdi-bucket-notifications: connection is already closed due to connection error; cause: com.rabbitmq.client.MissedHeartbeatException: Detected missed server heartbeats, heartbeat interval: 60 seconds, RabbitMQ node hostname: 172.16.44.201
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.recoverExchange(AutorecoveringConnection.java:770) ~[service.jar:?]
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.recoverTopology(AutorecoveringConnection.java:723) ~[service.jar:?]
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.beginAutomaticRecovery(AutorecoveringConnection.java:602) ~[service.jar:?]
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.lambda$addAutomaticRecoveryListener$3(AutorecoveringConnection.java:524) ~[service.jar:?]
at com.rabbitmq.client.impl.AMQConnection.notifyRecoveryCanBeginListeners(AMQConnection.java:839) ~[service.jar:?]
at com.rabbitmq.client.impl.AMQConnection.doFinalShutdown(AMQConnection.java:816) ~[service.jar:?]
at com.rabbitmq.client.impl.AMQConnection.handleHeartbeatFailure(AMQConnection.java:781) ~[service.jar:?]
at com.rabbitmq.client.impl.nio.NioLoop.lambda$handleHeartbeatFailure$0(NioLoop.java:281) ~[service.jar:?]
at java.lang.Thread.run(Thread.java:1589) [?:?]
Caused by: com.rabbitmq.client.AlreadyClosedException: connection is already closed due to connection error; cause: com.rabbitmq.client.MissedHeartbeatException: Detected missed server heartbeats, heartbeat interval: 60 seconds, RabbitMQ node hostname: 172.16.44.201
at com.rabbitmq.client.impl.AMQChannel.ensureIsOpen(AMQChannel.java:281) ~[service.jar:?]
at com.rabbitmq.client.impl.AMQChannel.rpc(AMQChannel.java:365) ~[service.jar:?]
at com.rabbitmq.client.impl.AMQChannel.privateRpc(AMQChannel.java:305) ~[service.jar:?]
at com.rabbitmq.client.impl.AMQChannel.exnWrappingRpc(AMQChannel.java:152) ~[service.jar:?]
at com.rabbitmq.client.impl.ChannelN.exchangeDeclare(ChannelN.java:804) ~[service.jar:?]
at com.rabbitmq.client.impl.ChannelN.exchangeDeclare(ChannelN.java:746) ~[service.jar:?]
at com.rabbitmq.client.impl.ChannelN.exchangeDeclare(ChannelN.java:47) ~[service.jar:?]
at com.rabbitmq.client.impl.recovery.RecordedExchange.recover(RecordedExchange.java:36) ~[service.jar:?]
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.lambda$recoverExchange$12(AutorecoveringConnection.java:759) ~[service.jar:?]
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.wrapRetryIfNecessary(AutorecoveringConnection.java:914) ~[service.jar:?]
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.recoverExchange(AutorecoveringConnection.java:758) ~[service.jar:?]
... 8 more
Foxcapades commented
The error appears to have been caught and logged by the RabbitMQ client itself. Nothing else was logged after this error, which suggests the service went down, but whether the server was still online was not confirmed.
Foxcapades commented
Possibly relates to #13
Foxcapades commented
This particular situation happened because RabbitMQ went down after VDI had established its initial connection but somehow didn't send a shutdown signal?
To test this locally we would likely need to create a custom script that accepts a TCP socket connection on port 5672 and then immediately crashes without sending any data.
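
A minimal sketch of what such a script might look like, in Python. The function names `open_listener` and `accept_and_drop` and the `hold_seconds` parameter are made up for illustration and are not part of VDI or RabbitMQ. The script only accepts one connection and drops it without writing a byte. Whether that is enough to reproduce the `MissedHeartbeatException` above has not been verified: an abrupt close may surface in the client as a different connection error, and keeping the socket open but silent (a larger `hold_seconds`) may be closer to what happened.

```python
import socket
import time


def open_listener(host="127.0.0.1", port=5672):
    """Open a listening TCP socket where the RabbitMQ client expects a broker."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts of the script on the same port.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def accept_and_drop(server, hold_seconds=0.0):
    """Accept one connection, send nothing, then close it.

    hold_seconds keeps the connection open but silent before the drop
    (hypothetical knob, to imitate a broker that stops responding).
    Returns the (host, port) of the client that connected.
    """
    conn, peer = server.accept()
    try:
        # Deliberately no conn.recv() or conn.send(): the fake broker is mute.
        time.sleep(hold_seconds)
    finally:
        conn.close()
    return peer
```

To try it, call `open_listener()` followed by `accept_and_drop(listener)` from a small runner, then start the service with its RabbitMQ host pointed at `127.0.0.1`. Note that this fake broker never completes the AMQP handshake, so it exercises a failure during the initial connection rather than one after the connection was established.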