eclipse-vertx/vertx-sql-client

PgSubscriber occasionally crashes unexpectedly

Closed this issue · 2 comments

Version

vert.x 4.5.7

Context

I have a problem with the io.vertx.pgclient.pubsub.PgSubscriber class provided by io.vertx:vertx-pg-client.

Occasionally the error in the following stack trace appears and makes my whole PgSubscriber thread crash. (It happens roughly once in 10,000 times and may be connected to high server load at that moment.)

2024-05-21T09:48:58,807 - java.lang.NullPointerException: Cannot invoke "io.vertx.pgclient.impl.codec.PgCommandCodec.handleNoticeResponse(io.vertx.pgclient.impl.codec.NoticeResponse)" because the return value of "io.vertx.pgclient.impl.codec.PgCodec.peek()" is null
2024-05-21T09:48:58,808 -  at io.vertx.pgclient.impl.codec.PgDecoder.decodeNotice(PgDecoder.java:276)
2024-05-21T09:48:58,808 -  at io.vertx.pgclient.impl.codec.PgDecoder.decodeMessage(PgDecoder.java:147)
2024-05-21T09:48:58,808 -  at io.vertx.pgclient.impl.codec.PgDecoder.channelRead(PgDecoder.java:123)
2024-05-21T09:48:58,808 -  at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
2024-05-21T09:48:58,808 -  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
2024-05-21T09:48:58,808 -  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
2024-05-21T09:48:58,808 -  at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
2024-05-21T09:48:58,808 -  at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
2024-05-21T09:48:58,808 -  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
2024-05-21T09:48:58,809 -  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
2024-05-21T09:48:58,809 -  at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
2024-05-21T09:48:58,809 -  at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
2024-05-21T09:48:58,809 -  at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
2024-05-21T09:48:58,809 -  at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
2024-05-21T09:48:58,809 -  at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
2024-05-21T09:48:58,809 -  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
2024-05-21T09:48:58,809 -  at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
2024-05-21T09:48:58,809 -  at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
2024-05-21T09:48:58,809 -  at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
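
For readers following the trace: the message says `PgCodec.peek()` returned null at the moment a NoticeResponse was decoded, which suggests the notice arrived while no command was in flight. The plain-Java sketch below reproduces that shape with an `ArrayDeque` and shows the kind of null guard the report asks for. It is only an illustration under that assumption. `NoticeGuardSketch`, `Command`, `submit`, `complete` and `onNotice` are hypothetical names, not the pgclient source, and this is not necessarily what the actual fix does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class NoticeGuardSketch {

    /** Hypothetical stand-in for a command that is waiting for its server response. */
    static final class Command {
        final List<String> notices = new ArrayList<>();

        void handleNotice(String notice) {
            notices.add(notice);
        }
    }

    // Commands that were sent and have not completed yet (assumed structure).
    private final Deque<Command> inflight = new ArrayDeque<>();

    // Notices that arrived while nothing was in flight.
    final List<String> unsolicited = new ArrayList<>();

    void submit(Command cmd) {
        inflight.add(cmd);
    }

    void complete() {
        inflight.poll();
    }

    void onNotice(String notice) {
        Command current = inflight.peek(); // ArrayDeque.peek() returns null when the deque is empty
        if (current == null) {
            // Without this guard, current.handleNotice(...) would throw a NullPointerException.
            unsolicited.add(notice);
            System.err.println("WARN: notice received with no command in flight: " + notice);
            return;
        }
        current.handleNotice(notice);
    }

    public static void main(String[] args) {
        NoticeGuardSketch codec = new NoticeGuardSketch();
        codec.onNotice("notice while idle");   // logged as a warning, no exception
        Command cmd = new Command();
        codec.submit(cmd);
        codec.onNotice("notice while busy");   // routed to the in-flight command
        codec.complete();
    }
}
```

A subscriber connection spends most of its time idle after `LISTEN`, so under this assumption an asynchronous notice from the server would find the queue empty, which would be consistent with the rare, load-dependent failures described above.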

This error seems to originate in the core of the pgclient classes, and when it happens I have no way to send out an error mail or to restart my process. Currently I initialise the process as follows, but as mentioned the catch blocks are never reached, nothing is logged from here, and no mails are sent.

PgSubscriber subscriber = PgSubscriber.subscriber(vertx, new PgConnectOptions()
        .setPort(dbConfig.getPort())
        .setHost(dbConfig.getHost())
        .setDatabase(dbConfig.getDatabase())
        .setUser(dbConfig.getUser())
        .setPassword(dbConfig.getPassword()));

subscriber.connect(ar -> {
    if (ar.succeeded()) {
        // Use a separate channel name in the test environment.
        String testSuffix = Config.getInstance().getTest() ? "_test" : "";

        subscriber.channel("pubsub_shipment" + testSuffix).handler(payload -> {
            try {
                if (!StringUtil.isEmpty(payload)) {
                    log.info("Received pubsub_shipment: " + payload);
                    updateInMemoryCollectionsForShipments(new JsonArray().add(payload));
                }
            } catch (Exception e) {
                log.error("Unexpected Error in pubsub for shipments", e);
                MailUtil.sendTechnicalErrorMail("Shipment PubSub failed", "Unexpected Error in pubsub for shipments:", e);
            }
        }).exceptionHandler(event -> {
            log.error("Error in pubsub for shipments", event);
            MailUtil.sendTechnicalErrorMail("Shipment PubSub failed", "Error in pubsub for shipments:", event);
        });
        // (the rest of the callback was not included in the report)
    }
});

So how can I prevent this error from happening? And how can I set up a try/catch or an exception handler so that I can send out error mails or re-initialise the subscriber when this error occurs? As the error log shows, no log lines from my application are visible, only ones from the internal pgclient classes.

I imagine this error is very hard to reproduce. Would it at least be possible, in the next Vert.x version, to surround the place where the NullPointerException happens with a try/catch and print a warning, so that the whole process does not crash? As said, when this error occurs neither my catch block nor the exception handler is reached, so I can neither restart the subscriber nor receive a mail telling me that the error occurred. Right now I have to check manually every day whether the subscriber is still running or has crashed again.
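
Until a fixed version is in use, the daily manual check could in principle be replaced by a small staleness watchdog. The sketch below is plain Java and independent of the Vert.x API. `SubscriberWatchdog`, `touch` and `check` are hypothetical names. It also assumes something sends a notification on the channel at regular intervals (for example a scheduled heartbeat `NOTIFY`), because without that a quiet period is indistinguishable from a dead subscriber.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class SubscriberWatchdog {

    private final LongSupplier clockMillis;   // injectable clock, e.g. System::currentTimeMillis
    private final long maxSilenceMillis;      // longest tolerated gap between notifications
    private final Runnable onStale;           // e.g. send a mail and recreate the subscriber
    private final AtomicLong lastSeen;

    public SubscriberWatchdog(LongSupplier clockMillis, long maxSilenceMillis, Runnable onStale) {
        this.clockMillis = clockMillis;
        this.maxSilenceMillis = maxSilenceMillis;
        this.onStale = onStale;
        this.lastSeen = new AtomicLong(clockMillis.getAsLong());
    }

    /** Call from the channel handler for every payload, heartbeats included. */
    public void touch() {
        lastSeen.set(clockMillis.getAsLong());
    }

    /** Call periodically; runs onStale and returns true if the silence exceeded the limit. */
    public boolean check() {
        long now = clockMillis.getAsLong();
        if (now - lastSeen.get() > maxSilenceMillis) {
            // Reset so the callback fires at most once per silence window.
            lastSeen.set(now);
            onStale.run();
            return true;
        }
        return false;
    }
}
```

In the setup above, `touch()` would be called at the top of the `pubsub_shipment` handler, and `check()` could be driven by a periodic timer such as `vertx.setPeriodic(...)`, with `onStale` doing what the unreachable catch blocks were meant to do: log, send the error mail, and create a fresh subscriber.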

Fixed by c4881b4