puppetlabs/puppetdb

java.nio.channels.CancelledKeyException

jjarokergc opened this issue · 4 comments

I am attempting to troubleshoot the following problem and was hoping someone could point me in the right direction. Is this a problem in puppet dashboard or puppetdb?

Describe the Bug

Puppetdb is failing with the error below in a fresh installation of a 3-node setup of puppet master, db and db01 in a staging environment:

  • puppetdb version: 8.1.0
  • psql (PostgreSQL) 14.9 (Ubuntu 14.9-1.pgdg22.04+1)
  • puppetserver version: 8.2.0
  • Ubuntu 22

The same configuration is operating normally in my production environment:

  • puppetdb version: 7.13.1
  • psql (PostgreSQL) 11.21 (Ubuntu 11.21-1.pgdg20.04+1)
  • puppetserver version: 7.12.0
  • Ubuntu 20

This sometimes appears on puppet dashboard the following error. Other times, the 3 nodes are shown in "unreported" state (see below).

Internal Server Error
This error usually occurs because:
    We were unable to reach PuppetDB;
    The query to be executed was malformed resulting in an incorrectly encoded request.

Puppet server appears to work properly with no errors, so I am not sure if this defect is in puppet dashboard or puppetdb.

Steps to Reproduce

  1. New 3-node installation of server, db01, puppetdb, puppetdashboard (versions above)
  2. Accessing puppetdb web page fails

Additional Context

Unreported states
image

Stack Trace

2023-08-25T13:12:47.626-04:00 ERROR [p.p.middleware] #error {
 :cause nil
 :via
 [{:type java.nio.channels.CancelledKeyException
   :message nil
   :at [sun.nio.ch.SelectionKeyImpl ensureValid SelectionKeyImpl.java 75]}]
 :trace
 [[sun.nio.ch.SelectionKeyImpl ensureValid SelectionKeyImpl.java 75]
  [sun.nio.ch.SelectionKeyImpl interestOps SelectionKeyImpl.java 104]
  [java.nio.channels.spi.AbstractSelectableChannel register AbstractSelectableChannel.java 233]
  [java.nio.channels.SelectableChannel register SelectableChannel.java 260]
  [jdk.internal.reflect.GeneratedMethodAccessor19 invoke nil -1]
  [jdk.internal.reflect.DelegatingMethodAccessorImpl invoke DelegatingMethodAccessorImpl.java 43]
  [java.lang.reflect.Method invoke Method.java 568]
  [clojure.lang.Reflector invokeMatchingMethod Reflector.java 167]
  [clojure.lang.Reflector invokeInstanceMethod Reflector.java 102]
  [puppetlabs.puppetdb.query.monitor$stop_query_at_deadline_or_disconnect invokeStatic monitor.clj 328]
  [puppetlabs.puppetdb.query.monitor$stop_query_at_deadline_or_disconnect invoke monitor.clj 324]
  [puppetlabs.puppetdb.http.query$wrap_typical_query$fn__31076$fn__31078 invoke query.clj 434]
  [puppetlabs.puppetdb.http.query$wrap_typical_query$fn__31076 invoke query.clj 404]
  [bidi.ring$fn__35985 invokeStatic ring.cljc 25]
  [bidi.ring$fn__35985 invoke ring.cljc 21]
  [bidi.ring$fn__35970$G__35965__35979 invoke ring.cljc 16]
  [puppetlabs.puppetdb.middleware$fn__37279$make_pdb_handler__37288$fn__37291$fn__37293 invoke middleware.clj 429]
  [puppetlabs.puppetdb.middleware$wrap_with_illegal_argument_catch$fn__37121 invoke middleware.clj 100]
  [puppetlabs.puppetdb.middleware$verify_accepts_content_type$fn__37133 invoke middleware.clj 148]
  [puppetlabs.puppetdb.middleware$verify_content_type$fn__37145 invoke middleware.clj 181]
  [puppetlabs.puppetdb.middleware$verify_sync_version$fn__37229 invoke middleware.clj 363]
  [puppetlabs.puppetdb.middleware$wrap_with_metrics$fn__37190$fn__37199 invoke middleware.clj 281]
  [puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529$fn__26530 invoke metrics.clj 23]
  [puppetlabs.puppetdb.utils.metrics.proxy$java.lang.Object$Callable$7da976d4 call nil -1]
  [com.codahale.metrics.Timer time Timer.java 101]
  [puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529 invoke metrics.clj 23]
  [puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529$fn__26530 invoke metrics.clj 23]
  [puppetlabs.puppetdb.utils.metrics.proxy$java.lang.Object$Callable$7da976d4 call nil -1]
  [com.codahale.metrics.Timer time Timer.java 101]
  [puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529 invoke metrics.clj 23]
  [puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529$fn__26530 invoke metrics.clj 23]
  [puppetlabs.puppetdb.utils.metrics.proxy$java.lang.Object$Callable$7da976d4 call nil -1]
  [com.codahale.metrics.Timer time Timer.java 101]
  [puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529 invoke metrics.clj 23]
  [puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529$fn__26530 invoke metrics.clj 23]
  [puppetlabs.puppetdb.utils.metrics.proxy$java.lang.Object$Callable$7da976d4 call nil -1]
  [com.codahale.metrics.Timer time Timer.java 101]
  [puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529 invoke metrics.clj 23]
  [puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_ invokeStatic metrics.clj 26]
  [puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_ invoke metrics.clj 15]
  [puppetlabs.puppetdb.middleware$wrap_with_metrics$fn__37190 invoke middleware.clj 280]
  [puppetlabs.puppetdb.middleware$wrap_with_globals$fn__37116 invoke middleware.clj 94]
  [puppetlabs.puppetdb.middleware$wrap_with_exception_handling$fn__37125 invoke middleware.clj 115]
  [puppetlabs.puppetdb.http.server$build_app$fn__42708 invoke server.clj 80]
  [compojure.core$routing$fn__36589 invoke core.clj 151]
  [clojure.core$some invokeStatic core.clj 2718]
  [clojure.core$some invoke core.clj 2709]
  [compojure.core$routing invokeStatic core.clj 151]
  [compojure.core$routing doInvoke core.clj 148]
  [clojure.lang.RestFn invoke RestFn.java 423]
  [puppetlabs.puppetdb.pdb_routing$wrap_with_context$fn__46557 invoke pdb_routing.clj 32]
  [compojure.core$if_context$fn__36613 invoke core.clj 218]
  [compojure.core$routing$fn__36589 invoke core.clj 151]
  [clojure.core$some invokeStatic core.clj 2718]
  [clojure.core$some invoke core.clj 2709]
  [compojure.core$routing invokeStatic core.clj 151]
  [compojure.core$routing doInvoke core.clj 148]
  [clojure.lang.RestFn applyTo RestFn.java 139]
  [clojure.core$apply invokeStatic core.clj 669]
  [clojure.core$apply invoke core.clj 662]
  [compojure.core$routes$fn__36593 invoke core.clj 156]
  [compojure.core$routing$fn__36589 invoke core.clj 151]
  [clojure.core$some invokeStatic core.clj 2718]
  [clojure.core$some invoke core.clj 2709]
  [compojure.core$routing invokeStatic core.clj 151]
  [compojure.core$routing doInvoke core.clj 148]
  [clojure.lang.RestFn invoke RestFn.java 460]
  [puppetlabs.puppetdb.pdb_routing$pdb_app$fn__46568 invoke pdb_routing.clj 60]
  [compojure.core$if_context$fn__36613 invoke core.clj 218]
  [ring.middleware.params$wrap_params$fn__37084 invoke params.clj 67]
  [puppetlabs.puppetdb.middleware$wrap_with_certificate_cn$fn__37106 invoke middleware.clj 76]
  [puppetlabs.puppetdb.middleware$wrap_with_default_body$fn__37111 invoke middleware.clj 83]
  [puppetlabs.puppetdb.middleware$wrap_with_debug_logging$fn__37089 invoke middleware.clj 40]
  [puppetlabs.i18n.core$locale_negotiator$fn__272 invoke core.clj 361]
  [puppetlabs.trapperkeeper.services.webserver.jetty9_core$ring_handler$fn__48220 invoke jetty9_core.clj 460]
  [puppetlabs.trapperkeeper.services.webserver.jetty9_core.proxy$org.eclipse.jetty.server.handler.AbstractHandler$ff19274a handle nil -1]
  [org.eclipse.jetty.server.handler.HandlerWrapper handle HandlerWrapper.java 127]
  [org.eclipse.jetty.server.handler.ScopedHandler nextHandle ScopedHandler.java 235]
  [org.eclipse.jetty.server.handler.ContextHandler doHandle ContextHandler.java 1440]
  [org.eclipse.jetty.server.handler.ScopedHandler nextScope ScopedHandler.java 190]
  [org.eclipse.jetty.server.handler.ContextHandler doScope ContextHandler.java 1355]
  [org.eclipse.jetty.server.handler.ScopedHandler handle ScopedHandler.java 141]
  [org.eclipse.jetty.server.handler.ContextHandlerCollection handle ContextHandlerCollection.java 234]
  [org.eclipse.jetty.server.handler.HandlerCollection handle HandlerCollection.java 146]
  [org.eclipse.jetty.server.handler.gzip.GzipHandler handle GzipHandler.java 772]
  [org.eclipse.jetty.server.handler.RequestLogHandler handle RequestLogHandler.java 54]
  [com.puppetlabs.trapperkeeper.services.webserver.jetty9.utils.MDCRequestLogHandler handle MDCRequestLogHandler.java 36]
  [org.eclipse.jetty.server.handler.StatisticsHandler handle StatisticsHandler.java 181]
  [org.eclipse.jetty.server.handler.HandlerWrapper handle HandlerWrapper.java 127]
  [org.eclipse.jetty.server.Server handle Server.java 516]
  [org.eclipse.jetty.server.HttpChannel lambda$handle$1 HttpChannel.java 487]
  [org.eclipse.jetty.server.HttpChannel dispatch HttpChannel.java 732]
  [org.eclipse.jetty.server.HttpChannel handle HttpChannel.java 479]
  [org.eclipse.jetty.server.HttpConnection onFillable HttpConnection.java 277]
  [org.eclipse.jetty.io.AbstractConnection$ReadCallback succeeded AbstractConnection.java 311]
  [org.eclipse.jetty.io.FillInterest fillable FillInterest.java 105]
  [org.eclipse.jetty.io.ChannelEndPoint$1 run ChannelEndPoint.java 104]
  [org.eclipse.jetty.util.thread.strategy.EatWhatYouKill runTask EatWhatYouKill.java 338]
  [org.eclipse.jetty.util.thread.strategy.EatWhatYouKill doProduce EatWhatYouKill.java 315]
  [org.eclipse.jetty.util.thread.strategy.EatWhatYouKill tryProduce EatWhatYouKill.java 173]
  [org.eclipse.jetty.util.thread.strategy.EatWhatYouKill run EatWhatYouKill.java 131]
  [org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread run ReservedThreadExecutor.java 409]
  [org.eclipse.jetty.util.thread.QueuedThreadPool runJob QueuedThreadPool.java 883]
  [org.eclipse.jetty.util.thread.QueuedThreadPool$Runner run QueuedThreadPool.java 1034]
  [java.lang.Thread run Thread.java 833]]}

This is additional information to help troubleshoot the problem.

  • The 500-error only appears when accessing the 'Overview' tab on puppet board. The other tabs show normal log messages, but the 500 error is always present when accessing 'Overview'
  • The Overview page loads correctly about 60% of the time.
  • When the overview page fails, the following message appears:
Internal Server Error
This error usually occurs because:
    We were unable to reach PuppetDB;
    The query to be executed was malformed resulting in an incorrectly encoded request.
Please have a look at the log output for further information.
  • I am not able to trigger the Overview page failure by refreshing or clicking between tabs. The overview page error appears randomly
austb commented

Thanks @jjarokergc, we are going to start looking into this issue. Another user in the community Slack reported hitting this issue with Puppetboard as well.

I have not had the chance to spin up Puppetboard yet, but the user in Slack reported that setting PDB_PROMPTLY_END_QUERIES=false in their systemd unit file made the issue go away. That variable completely disables the query timeout code we just added, so if you set it I would recommend removing it after we release a fix for the bug.

austb commented

The fix for this bug has been released in PuppetDB 8.1.1

Hello @austb,

Thank you. I just installed version 8.1.1 and Puppet board works without error messages in the logs.

Jon