voxpupuli/puppetboard

Stop using the default / completely random SECRET_KEY value

TwizzyDizzy opened this issue · 9 comments

(Original issue title: Query sometimes returns results, sometimes not)

Describe the bug

Query page sometimes shows a result, sometimes does not, even when sending the same query several times.

Javascript console doesn't show obvious errors... it is known to occur in several OS/Browser scenarios (Windows, OS X, Chromium, Firefox).

What I did notice though: in cases that do not return a result, the PuppetDB access log shows one query, while in cases that show a result, the PuppetDB access log shows two queries:

Not showing results

0.0.0.0 - - [21/Oct/2022:11:53:39 +0200] "GET /pdb/query/v4/environments HTTP/1.1" 200 273 "-" "puppetboard/4.1.1 (r/2.28.1)" 3 -

Showing results

0.0.0.0 - - [21/Oct/2022:11:53:49 +0200] "GET /pdb/query/v4/environments HTTP/1.1" 200 273 "-" "puppetboard/4.1.1 (r/2.28.1)" 3 -
0.0.0.0 - - [21/Oct/2022:11:53:49 +0200] "GET /pdb/query/v4?query=environments%7B%7D HTTP/1.1" 200 273 "-" "puppetboard/4.1.1 (r/2.28.1)" 3 -

Should be easily reproducable by simply sending the query environments{} multiple times in a row... if nobody can reproduce this, it must be an error on our side, then I'd need to dig deeper on my side... so any feedback is appreciated.

Puppetboard version

4.1.1

Environment and installation method

  • CentOS 8 Stream, Python 3.9
  • Installed via PyPI (but not via the Puppetboard Puppet module)

settings.py

      import os

      PUPPETDB_HOST = '%{facts.networking.fqdn}'
      PUPPETDB_PORT = 8081
      PUPPETDB_SSL_VERIFY = '/etc/puppetlabs/puppet/ssl/certs/ca.pem'
      PUPPETDB_KEY = '/opt/puppetboard-venv/private.pem'
      PUPPETDB_CERT = '/opt/puppetboard-venv/public.pem'
      PUPPETDB_TIMEOUT = 20
      DEFAULT_ENVIRONMENT = '*'
      SECRET_KEY = os.urandom(24)
      DEV_LISTEN_HOST = '127.0.0.1'
      DEV_LISTEN_PORT = 5000
      DEV_COFFEE_LOCATION = 'coffee'
      UNRESPONSIVE_HOURS = 24
      ENABLE_QUERY = True
      LOCALISE_TIMESTAMP = True
      LOGLEVEL = 'debug'
      NORMAL_TABLE_COUNT = 250
      LITTLE_TABLE_COUNT = 50
      TABLE_COUNT_SELECTOR = [10, 25, 50, 100, 250, 500, 1000]
      OFFLINE_MODE = True
      ENABLE_CATALOG = True
      OVERVIEW_FILTER = None
      GRAPH_TYPE = 'donut'
      FAVORITE_ENVS = ['production','test','development']
      GRAPH_FACTS = ['architecture',
                     'clientversion',
                     'domain',
[...]
                     'sudoversion',
                     'tier' ]
      INVENTORY_FACTS = [('Hostname', 'fqdn'),
                         ('IP Address', 'ipaddress'),
                         ('OS', 'operatingsystem'),
                         ('OS-Release', 'operatingsystemrelease'),
                         ('Architecture', 'hardwaremodel'),
                         ('Kernel Version', 'kernelrelease'),
                         ('Puppet Version', 'puppetversion'), ]
      REFRESH_RATE = 30
      DAILY_REPORTS_CHART_ENABLED = True
      DAILY_REPORTS_CHART_DAYS = 8
      SHOW_ERROR_AS = 'friendly'
      CODE_PREFIX_TO_REMOVE = '/etc/puppetlabs/code/environments'

Hi, thanks for reporting this @TwizzyDizzy!

Can you please try generating a random string once and setting it in your config under SECRET_KEY?

I think that our default os.urandom(24) is wrong as it generates a different value for each app process and probably you are running 2+ processes of the app with uWSGI, so when you make a query request from a page generated by one process and it goes to another process, the secrets mismatch..

That sounds entirely possible! Indeed I have two WSGI processes running.

Also read up on what that static key is actually about (CSRF prevention) and after reading it, your answer makes sense even more.

Thanks for the hint, will give it a shot and report back tomorrow.

Cheers
Thomas

I have set the SECRET_KEY statically and now both processes use the same key, hence the error disappeared.

Indeed, I'd say that the following default is... at least misleading and should be commented upon accordingly:

Want a PR for that?

Cheers
Thomas

Glad that it helped!

A PR would be nice, but I am not sure how to deal with this in a best way. I am afraid that a comment might go unnoticed. Perhaps assuming that the Puppetboard is usually not set up in public we should hardcode a default value? 🤔

I am afraid that a comment might go unnoticed.

Yep, also thought about this and you're probably right. But then again, setting the same SECRET_KEY everywhere would solve the immediate problem but would also be "insecure by default" (I know, very limited in scope, but nonetheless) because then probably nobody will adjust the default either. Catch 22 really.

You could automatically generate it, but then you would run into the problem that you would need to generate a (locally) stable key across (local) instances [1] which kinda works against the randomness requirement.

So the options are:

  • have a default that would break when running more than one WSGI process
  • have a static key which people wouldn't touch and then everybody uses the same static key
  • have a locally stable auto-generated value, which defeats the goal of randomness

I guess the minimal impact really is to document it and keep the os.urandom(24) as a default and document properly, that you need to set this statically if you're running > 1 WSGI processes.

[1] imagine a scenario where you want to have a redundant Puppetboard setup with different load balancer backends on different hosts. Then all instances on ONE instance would have a stable SECRET_KEY, but another backend will have different ones leading, again, to the same issue but one abstraction level above :D

Cheers
Thomas

Yes, I’ve had a similar thought process. :)

But I realized that the current default is also bad even with a single process of the app. It generates a problem like yours when the app restarts. And restarts may be happening often with harakiri in uWSGI or if the app is running in k8s.

Perhaps we should remove the default value and require everyone to set their own secret? The only downside is that technically it would be a backward-incompatible change, so we should uptick the app’s major version and it feels wrong with such a minor change..

Maybe better docs and an appropriate warning in the logs if the default is used? We could use a semi-random default, f.e. with a „default-” prefix which would trigger the warning.

But I realized that the current default is also bad even with a single process of the app. It generates a problem like yours when the app restarts. And restarts may be happening often with harakiri in uWSGI or if the app is running in k8s.

Indeed, didn't think of that.

Perhaps we should remove the default value and require everyone to set their own secret?

Yeah, thought about that, too, though it seemed a bit too unwelcoming to new users. But in the end it's the most sane solution, all things considered.

The only downside is that technically it would be a backward-incompatible change, so we should uptick the app’s major version and it feels wrong with such a minor change..

It is breaking. And if the rule is to major-bump on backwards-incompatible changes, that is how it needs to be, despite of any feeling (even though I share it). Sometimes a big initial error (in hindsight, I mean - putting the os.urandom there in the first place) can only be changed by a big, corrective action (major-bump, breaking change).

Maybe better docs and an appropriate warning in the logs if the default is used? We could use a semi-random default, f.e. with a „default-” prefix which would trigger the warning.

That seems like a half-baked solution in order to not to make a proper decision. Docs, yes, but adding additional code to work around a breaking change seems undecisive. After all, reading the changelogs of software you use should not be optional :)

Cheers
Thomas

Closing. Has been concluded with the release of 5.0.0. (https://github.com/voxpupuli/puppetboard/tree/v5.0.0). The default SECRET_KEY is now empty, forcing the user to set it explicitly.

Cheers
Thomas