voxpupuli/puppetboard

Intermittent errors after PuppetDB 8.0 to 8.1 upgrade

Closed this issue · 13 comments

ic248 commented

Describe the bug

I had a working version of puppetboard until I upgraded from Puppet Server 8.1 to Puppet 8.2 & PuppetDB 8.0 to 8.1. Now I've upgraded I'm getting intermittent page errors (pretty much ever other page load). Sometimes the error manifests like below and other times it just shows a blank page (still with menus). When it isn't showing the error below or a blank page it seems to display the correct information.

Internal Server Error

This error usually occurs because:

We were unable to reach PuppetDB;
The query to be executed was malformed resulting in an incorrectly encoded request.
Please have a look at the log output for further information.

In the httpd logs I get the following

[Wed Aug 30 09:37:39.574405 2023] [wsgi:error] [pid 210871] [remote *IP*:59285] ERROR:pypuppetdb.api.base: 127.0.0.1:8081 over HTTPS.
[Wed Aug 30 09:37:39.623394 2023] [wsgi:error] [pid 210871] [remote *IP*:59285] ERROR:puppetboard.core:Exception on / [GET]
[Wed Aug 30 09:37:39.623417 2023] [wsgi:error] [pid 210871] [remote *IP*:59285] Traceback (most recent call last):
[Wed Aug 30 09:37:39.623451 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/flask/app.py", line 2447, in wsgi_app
[Wed Aug 30 09:37:39.623457 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     response = self.full_dispatch_request()
[Wed Aug 30 09:37:39.623460 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/flask/app.py", line 1952, in full_dispatch_request
[Wed Aug 30 09:37:39.623470 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     rv = self.handle_user_exception(e)
[Wed Aug 30 09:37:39.623473 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/flask/app.py", line 1821, in handle_user_exception
[Wed Aug 30 09:37:39.623476 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     reraise(exc_type, exc_value, tb)
[Wed Aug 30 09:37:39.623478 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/flask/_compat.py", line 39, in reraise
[Wed Aug 30 09:37:39.623481 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     raise value
[Wed Aug 30 09:37:39.623484 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/flask/app.py", line 1950, in full_dispatch_request
[Wed Aug 30 09:37:39.623487 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     rv = self.dispatch_request()
[Wed Aug 30 09:37:39.623490 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/flask/app.py", line 1936, in dispatch_request
[Wed Aug 30 09:37:39.623493 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     return self.view_functions[rule.endpoint](**req.view_args)
[Wed Aug 30 09:37:39.623496 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/puppetboard/views/index.py", line 102, in index
[Wed Aug 30 09:37:39.623499 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     for node in nodes:
[Wed Aug 30 09:37:39.623501 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/pypuppetdb/api/query.py", line 61, in nodes
[Wed Aug 30 09:37:39.623512 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     summarize_by='certname',
[Wed Aug 30 09:37:39.623515 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/pypuppetdb/api/base.py", line 363, in _query
[Wed Aug 30 09:37:39.623518 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     return self._make_request(url, request_method, payload)
[Wed Aug 30 09:37:39.623528 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/pypuppetdb/api/base.py", line 397, in _make_request
[Wed Aug 30 09:37:39.623531 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     r.raise_for_status()
[Wed Aug 30 09:37:39.623544 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]   File "/srv/puppetboard/virtenv-puppetboard/lib64/python3.6/site-packages/requests/models.py", line 960, in raise_for_status
[Wed Aug 30 09:37:39.623547 2023] [wsgi:error] [pid 210871] [remote *IP*:59285]     raise HTTPError(http_error_msg, response=self)
[Wed Aug 30 09:37:39.623551 2023] [wsgi:error] [pid 210871] [remote *IP*:59285] requests.exceptions.HTTPError: 500 Server Error: Server Error for url: https://127.0.0.1:8081/pdb/query/v4/event-counts?query=%5B%22%3D%22%2C+%22latest_report%3F%22%2C+true%5D&summarize_by=certname

Puppetboard version

v9.0.0

Environment and installation method

  • RHEL 8 and Python 3.6
  • installed with the Forge module v9.0.0 and using the PyPI package (default).
        'puppetboard':
            python_version => '3.6',
            manage_git => true,
            manage_virtualenv => true,
            puppetdb_host => '127.0.0.1',
            puppetdb_port => 8081,
            puppetdb_key => $facts['extlib__puppet_config']['main']['hostprivkey'],
            puppetdb_ssl_verify => $facts['extlib__puppet_config']['main']['localca'],
            puppetdb_cert => $facts['extlib__puppet_config']['main']['hostcert'],
            groups => 'puppet',
            offline_mode => true,
            enable_catalog => true,
            default_environment => $default_environment;

We also have some Puppet config to manage apache along with certs and access via mellon.

Please say if you require any more info.

Thanks
Ian

Same here with a fresh installation.

I've had the same errors (and downgraded to keep the dashboard running).
It seems the PuppetDB API returns HTTP/500 (Internal Server Error).

I don’t have a PuppetDB 8.x to test. But perhaps it’s a PuppetDB bug and should be reported to Puppetlabs?

ic248 commented

I'm not sure if it's the way puppetboard calls the PuppetDB API or just that there is an issue with the PuppetDB API itself. If we could confirm that it's the PuppetDB API and someone could give me an example of the request they made I'll see if I can report it to Puppetlabs. Looking at the release notes for 8.1.0 it doesn't mention any PuppetDB API updates but that's not to say they haven't changed things or a bug has been introduced https://www.puppet.com/docs/puppetdb/8/release_notes#new-features-and-improvements

See the last line in your attached log:

[Wed Aug 30 09:37:39.623551 2023] [wsgi:error] [pid 210871] [remote 129.12.1.25:59285] requests.exceptions.HTTPError: 500 Server Error: Server Error for url: https://127.0.0.1:8081/pdb/query/v4/event-counts?query=%5B%22%3D%22%2C+%22latest_report%3F%22%2C+true%5D&summarize_by=certname

ic248 commented

So interestingly I've queried the PuppetDB API via curl and it responds the same in 8.1.0 as in 8.0.1 and can't get it to give me a 500 error like puppetboard gets. Will dig a bit further either after lunch or on Monday.

curl -G 'https://127.0.0.1:8081/pdb/query/v4/event-counts?query=%5B%22%3D%22%2C+%22latest_report%3F%22%2C+true%5D&summarize_by=certname' --tlsv1 --cacert /etc/puppetlabs/puppet/ssl/certs/ca.pem --cert /etc/puppetlabs/puppet/ssl/certs/*SERVERNAME*.pem --key /etc/puppetlabs/puppet/ssl/private_keys/*SERVERNAME*.pem
[{"failures":0,"skips":0,"successes":0,"noops":3,"subject_type":"certname","subject":{"title":"*SERVERNAME*"}}]
absld commented

Same problem here. PuppetBoard throws a HTTP 500 error. The following error message can be found in PuppetDB logfile but I'm not able to figure out whats the exact problem here.

2023-09-11T09:55:13.316+02:00 ERROR [p.p.middleware] #error {
:cause nil
:via
[{:type java.nio.channels.CancelledKeyException
:message nil
:at [sun.nio.ch.SelectionKeyImpl ensureValid SelectionKeyImpl.java 71]}]
:trace
[[sun.nio.ch.SelectionKeyImpl ensureValid SelectionKeyImpl.java 71]
[sun.nio.ch.SelectionKeyImpl interestOps SelectionKeyImpl.java 96]
[java.nio.channels.spi.AbstractSelectableChannel register AbstractSelectableChannel.java 217]
[java.nio.channels.SelectableChannel register SelectableChannel.java 260]
[jdk.internal.reflect.NativeMethodAccessorImpl invoke0 NativeMethodAccessorImpl.java -2]
[jdk.internal.reflect.NativeMethodAccessorImpl invoke NativeMethodAccessorImpl.java 62]
[jdk.internal.reflect.DelegatingMethodAccessorImpl invoke DelegatingMethodAccessorImpl.java 43]
[java.lang.reflect.Method invoke Method.java 566]
[clojure.lang.Reflector invokeMatchingMethod Reflector.java 167]
[clojure.lang.Reflector invokeInstanceMethod Reflector.java 102]
[puppetlabs.puppetdb.query.monitor$stop_query_at_deadline_or_disconnect invokeStatic monitor.clj 328]
[puppetlabs.puppetdb.query.monitor$stop_query_at_deadline_or_disconnect invoke monitor.clj 324]
[puppetlabs.puppetdb.http.query$wrap_typical_query$fn__31076$fn__31078 invoke query.clj 434]
[puppetlabs.puppetdb.http.query$wrap_typical_query$fn__31076 invoke query.clj 404]
[bidi.ring$fn__35985 invokeStatic ring.cljc 25]
[bidi.ring$fn__35985 invoke ring.cljc 21]
[bidi.ring$fn__35970$G__35965__35979 invoke ring.cljc 16]
[puppetlabs.puppetdb.middleware$fn__37279$make_pdb_handler__37288$fn__37291$fn__37293 invoke middleware.clj 429]
[puppetlabs.puppetdb.middleware$wrap_with_illegal_argument_catch$fn__37121 invoke middleware.clj 100]
[puppetlabs.puppetdb.middleware$verify_accepts_content_type$fn__37133 invoke middleware.clj 148]
[puppetlabs.puppetdb.middleware$verify_content_type$fn__37145 invoke middleware.clj 181]
[puppetlabs.puppetdb.middleware$verify_sync_version$fn__37229 invoke middleware.clj 363]
[puppetlabs.puppetdb.middleware$wrap_with_metrics$fn__37190$fn__37199 invoke middleware.clj 281]
[puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529$fn__26530 invoke metrics.clj 23]
[puppetlabs.puppetdb.utils.metrics.proxy$java.lang.Object$Callable$7da976d4 call nil -1]
[com.codahale.metrics.Timer time Timer.java 101]
[puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529 invoke metrics.clj 23]
[puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529$fn__26530 invoke metrics.clj 23]
[puppetlabs.puppetdb.utils.metrics.proxy$java.lang.Object$Callable$7da976d4 call nil -1]
[com.codahale.metrics.Timer time Timer.java 101]
[puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529 invoke metrics.clj 23]
[puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529$fn__26530 invoke metrics.clj 23]
[puppetlabs.puppetdb.utils.metrics.proxy$java.lang.Object$Callable$7da976d4 call nil -1]
[com.codahale.metrics.Timer time Timer.java 101]
[puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529 invoke metrics.clj 23]
[puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529$fn__26530 invoke metrics.clj 23]
[puppetlabs.puppetdb.utils.metrics.proxy$java.lang.Object$Callable$7da976d4 call nil -1]
[com.codahale.metrics.Timer time Timer.java 101]
[puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_$fn__26528$fn__26529 invoke metrics.clj 23]
[puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_ invokeStatic metrics.clj 26]
[puppetlabs.puppetdb.utils.metrics$multitime_BANG__STAR_ invoke metrics.clj 15]
[puppetlabs.puppetdb.middleware$wrap_with_metrics$fn__37190 invoke middleware.clj 280]
[puppetlabs.puppetdb.middleware$wrap_with_globals$fn__37116 invoke middleware.clj 94]
[puppetlabs.puppetdb.middleware$wrap_with_exception_handling$fn__37125 invoke middleware.clj 115]
[puppetlabs.puppetdb.http.server$build_app$fn__42708 invoke server.clj 80]
[compojure.core$routing$fn__36589 invoke core.clj 151]
[clojure.core$some invokeStatic core.clj 2718]
[clojure.core$some invoke core.clj 2709]
[compojure.core$routing invokeStatic core.clj 151]
[compojure.core$routing doInvoke core.clj 148]
[clojure.lang.RestFn invoke RestFn.java 423]
[puppetlabs.puppetdb.pdb_routing$wrap_with_context$fn__46557 invoke pdb_routing.clj 32]
[compojure.core$if_context$fn__36613 invoke core.clj 218]
[compojure.core$routing$fn__36589 invoke core.clj 151]
[clojure.core$some invokeStatic core.clj 2718]
[clojure.core$some invoke core.clj 2709]
[compojure.core$routing invokeStatic core.clj 151]
[compojure.core$routing doInvoke core.clj 148]
[clojure.lang.RestFn applyTo RestFn.java 139]
[clojure.core$apply invokeStatic core.clj 669]
[clojure.core$apply invoke core.clj 662]
[compojure.core$routes$fn__36593 invoke core.clj 156]
[compojure.core$routing$fn__36589 invoke core.clj 151]
[clojure.core$some invokeStatic core.clj 2718]
[clojure.core$some invoke core.clj 2709]
[compojure.core$routing invokeStatic core.clj 151]
[compojure.core$routing doInvoke core.clj 148]
[clojure.lang.RestFn invoke RestFn.java 460]
[puppetlabs.puppetdb.pdb_routing$pdb_app$fn__46568 invoke pdb_routing.clj 60]
[compojure.core$if_context$fn__36613 invoke core.clj 218]
[ring.middleware.params$wrap_params$fn__37084 invoke params.clj 67]
[puppetlabs.puppetdb.middleware$wrap_with_certificate_cn$fn__37106 invoke middleware.clj 76]
[puppetlabs.puppetdb.middleware$wrap_with_default_body$fn__37111 invoke middleware.clj 83]
[puppetlabs.puppetdb.middleware$wrap_with_debug_logging$fn__37089 invoke middleware.clj 40]
[puppetlabs.i18n.core$locale_negotiator$fn__272 invoke core.clj 361]
[puppetlabs.trapperkeeper.services.webserver.jetty9_core$ring_handler$fn__48220 invoke jetty9_core.clj 460]
[puppetlabs.trapperkeeper.services.webserver.jetty9_core.proxy$org.eclipse.jetty.server.handler.AbstractHandler$ff19274a handle nil -1]
[org.eclipse.jetty.server.handler.HandlerWrapper handle HandlerWrapper.java 127]
[org.eclipse.jetty.server.handler.ScopedHandler nextHandle ScopedHandler.java 235]
[org.eclipse.jetty.server.handler.ContextHandler doHandle ContextHandler.java 1440]
[org.eclipse.jetty.server.handler.ScopedHandler nextScope ScopedHandler.java 190]
[org.eclipse.jetty.server.handler.ContextHandler doScope ContextHandler.java 1355]
[org.eclipse.jetty.server.handler.ScopedHandler handle ScopedHandler.java 141]
[org.eclipse.jetty.server.handler.ContextHandlerCollection handle ContextHandlerCollection.java 234]
[org.eclipse.jetty.server.handler.HandlerCollection handle HandlerCollection.java 146]
[org.eclipse.jetty.server.handler.gzip.GzipHandler handle GzipHandler.java 772]
[org.eclipse.jetty.server.handler.RequestLogHandler handle RequestLogHandler.java 54]
[com.puppetlabs.trapperkeeper.services.webserver.jetty9.utils.MDCRequestLogHandler handle MDCRequestLogHandler.java 36]
[org.eclipse.jetty.server.handler.StatisticsHandler handle StatisticsHandler.java 181]
[org.eclipse.jetty.server.handler.HandlerWrapper handle HandlerWrapper.java 127]
[org.eclipse.jetty.server.Server handle Server.java 516]
[org.eclipse.jetty.server.HttpChannel lambda$handle$1 HttpChannel.java 487]
[org.eclipse.jetty.server.HttpChannel dispatch HttpChannel.java 732]
[org.eclipse.jetty.server.HttpChannel handle HttpChannel.java 479]
[org.eclipse.jetty.server.HttpConnection onFillable HttpConnection.java 277]
[org.eclipse.jetty.io.AbstractConnection$ReadCallback succeeded AbstractConnection.java 311]
[org.eclipse.jetty.io.FillInterest fillable FillInterest.java 105]
[org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint onFillable SslConnection.java 555]
[org.eclipse.jetty.io.ssl.SslConnection onFillable SslConnection.java 410]
[org.eclipse.jetty.io.ssl.SslConnection$2 succeeded SslConnection.java 164]
[org.eclipse.jetty.io.FillInterest fillable FillInterest.java 105]
[org.eclipse.jetty.io.ChannelEndPoint$1 run ChannelEndPoint.java 104]
[org.eclipse.jetty.util.thread.strategy.EatWhatYouKill runTask EatWhatYouKill.java 338]
[org.eclipse.jetty.util.thread.strategy.EatWhatYouKill doProduce EatWhatYouKill.java 315]
[org.eclipse.jetty.util.thread.strategy.EatWhatYouKill tryProduce EatWhatYouKill.java 173]
[org.eclipse.jetty.util.thread.strategy.EatWhatYouKill run EatWhatYouKill.java 131]
[org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread run ReservedThreadExecutor.java 409]
[org.eclipse.jetty.util.thread.QueuedThreadPool runJob QueuedThreadPool.java 883]
[org.eclipse.jetty.util.thread.QueuedThreadPool$Runner run QueuedThreadPool.java 1034]
[java.lang.Thread run Thread.java 834]]}

If I reload PuppetBoard some times, suddenly it works and is able to display the contents of PuppetDB as expected, but after another reload, I stops working again. No problems with querying the PuppetDB with curl.

I'm also not able to downgrade PuppetDB back to 8.0.1-1focal as I'm getting the following error afterwards...

2023-09-11T10:01:40.397+02:00 ERROR [p.t.internal] Error during service start!!!
java.lang.IllegalStateException: Your PuppetDB database contains a schema migration numbered 84, but this version of PuppetDB does not recognize that version.

ic248 commented

@absld running the pupeptdb service in debug mode by stopping the service and the running puppetdb foreground --debug I can see the same error coming up when I get the errors. Since we can't reproduce this error using curl it does make me wonder if puppetboard leaves a connection up or reuses a key which Puppetdb isn't happy about.
Either way we're no closer to saying if it's a puppetdb or puppetboard issue

Hello,
same error here with PuppetDB 8.1.0.
I see these errors in my puppetboard logs:

[Wed Sep 13 11:35:05.040546 2023] [wsgi:error] [pid 1501:tid 139888435332672] [client 172.30.200.79:55828] ERROR:pypuppetdb.api.base: localhost:8080 over HTTP., referer: http://myserver.mydomain.it/
[Wed Sep 13 11:35:05.040694 2023] [wsgi:error] [pid 1501:tid 139888435332672] [client 172.30.200.79:55828] ERROR:puppetboard.utils:500 Server Error: Server Error for url: http://localhost:8080/pdb/query/v4/reports?query=%5B%22extract%22%2C+%5B%5B%22function%22%2C+%22count%22%5D%2C+%22status%22%5D%2C+%5B%22and%22%2C+%5B%22%3E%3D%22%2C+%22producer_timestamp%22%2C+%222023-09-07T00%3A00%3A00.000000Z%22%5D%2C+%5B%22%3C%22%2C+%22producer_timestamp%22%2C+%222023-09-08T00%3A00%3A00.000000Z%22%5D%2C+%5B%22%3D%22%2C+%22environment%22%2C+%22production%22%5D%5D%2C+%5B%22group_by%22%2C+%22status%22%5D%5D, referer: http://myserver.mydomain.it/
[Wed Sep 13 11:35:05.045684 2023] [wsgi:error] [pid 1501:tid 139888435332672] [client 172.30.200.79:55828] ERROR:pypuppetdb.api.base: localhost:8080 over HTTP., referer: http://myserver.mydomain.it/
[Wed Sep 13 11:35:05.045799 2023] [wsgi:error] [pid 1501:tid 139888435332672] [client 172.30.200.79:55828] ERROR:puppetboard.utils:500 Server Error: Server Error for url: http://localhost:8080/pdb/query/v4/environments, referer: http://myserver.mydomain.it/
[Wed Sep 13 11:35:35.798437 2023] [wsgi:error] [pid 1501:tid 139888673166912] [client 172.30.200.79:55883] ERROR:pypuppetdb.api.base: localhost:8080 over HTTP.
[Wed Sep 13 11:35:35.798546 2023] [wsgi:error] [pid 1501:tid 139888673166912] [client 172.30.200.79:55883] ERROR:puppetboard.utils:500 Server Error: Server Error for url: http://localhost:8080/pdb/query/v4/nodes?query=%5B%22extract%22%2C+%5B%5B%22function%22%2C+%22count%22%5D%5D%2C+%5B%22and%22%2C+%5B%22%3D%22%2C+%22catalog_environment%22%2C+%22production%22%5D%5D%5D
[Wed Sep 13 11:35:35.802563 2023] [wsgi:error] [pid 1501:tid 139888673166912] [client 172.30.200.79:55883] ERROR:pypuppetdb.api.base: localhost:8080 over HTTP.
[Wed Sep 13 11:35:35.802664 2023] [wsgi:error] [pid 1501:tid 139888673166912] [client 172.30.200.79:55883] ERROR:puppetboard.utils:500 Server Error: Server Error for url: http://localhost:8080/pdb/query/v4/environments

It seems pypuppetdb.api.base and puppetboard.utils throw exceptions but only with PuppetDB 8.1.0

It seems that upgrade to PuppetDB v8.1.1 resolve the issue

Yes looks like it was resolved here puppetlabs/puppetdb#3866
Will upgrade my dev env and then close if all is happy

Resolved by upgrading to PuppetDB 8.1.1

Thanks to @ic248 and everyone here for reporting the issue and your patience and to @jjarokergc for reporting the bug to Puppetlabs!

I added info about the known bad version of PuppetDB to the README.