Handle "broken" Juju models
przemeklal commented
Currently, if the exporter encounters a model in an error state (e.g. one for which juju status -m model-name throws errors),
it simply crashes. Because the exporter tries to access every model in a Juju controller by default, a single broken model prevents it from exporting any data, even if all other models are fine.
Example traceback:
Jan 12 14:41:23 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unexpected facade EnvironUpgrader found, unable to decipher version to use
Jan 12 14:41:24 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unknown facade EnvironUpgrader
Jan 12 14:41:24 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unexpected facade EnvironUpgrader found, unable to decipher version to use
Jan 12 14:41:25 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unknown facade EnvironUpgrader
Jan 12 14:41:25 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unexpected facade EnvironUpgrader found, unable to decipher version to use
Jan 12 14:41:27 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unknown facade EnvironUpgrader
Jan 12 14:41:27 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unexpected facade EnvironUpgrader found, unable to decipher version to use
Jan 12 14:41:36 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: 2023-01-12 14:41:36,090 ERROR - Collection job resulted in error: model cache: model "f8dc82ca-ef1b-461b-80d0-36a0f36bb910" did not appear >
Jan 12 14:41:36 juju-f5291e-3-lxd-28 systemd[1]: snap.prometheus-juju-exporter.prometheus-juju-exporter.service: Main process exited, code=exited, status=1/FAILURE
Jan 12 14:41:36 juju-f5291e-3-lxd-28 systemd[1]: snap.prometheus-juju-exporter.prometheus-juju-exporter.service: Failed with result 'exit-code'.
The same model causes issues for the juju client as well:
$ juju status -m openstack
ERROR model cache: model "f8dc82ca-ef1b-461b-80d0-36a0f36bb910" did not appear in cache timeout
A possible workaround is to create a dedicated Juju user with login
access to the controller and grant it access only to the models that should be exported:
juju grant prometheus-juju-exporter admin controller
juju grant prometheus-juju-exporter admin model1
juju grant prometheus-juju-exporter admin model2
...
Any user with the superuser access level will try to access all models,
resulting in the same crash.
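
One way the exporter could tolerate a broken model is to isolate failures per model instead of letting a single error abort the whole collection job. The following is only a minimal sketch of that idea, not the exporter's actual code: `collect_all_models` and the `collect_model` callback are hypothetical names, and the callback stands in for whatever call the exporter really makes to fetch a model's status.

```python
import logging
from typing import Any, Callable, Dict, Iterable, List, Tuple

logger = logging.getLogger(__name__)


def collect_all_models(
    model_names: Iterable[str],
    collect_model: Callable[[str], Any],
) -> Tuple[Dict[str, Any], List[str]]:
    """Collect data from each model, skipping models that raise errors.

    `collect_model` is a hypothetical callback that returns the data for
    one model and raises an exception if the model cannot be queried.
    """
    results: Dict[str, Any] = {}
    failed: List[str] = []
    for name in model_names:
        try:
            results[name] = collect_model(name)
        except Exception as exc:
            # Broad catch on purpose: a broken model must not stop the
            # collection of the remaining healthy models.
            logger.error("Skipping model %s: %s", name, exc)
            failed.append(name)
    return results, failed
```

With this approach, the healthy models would still be exported, and the list of failed model names could be logged or exposed separately so that the broken model remains visible to operators.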