ysde/grafana-backup-tool

grafana-backup doesn't restore alerts

Opened this issue · 9 comments

We use grafana-backup-tool version 1.2.4 to back up our Grafana.
When we tried to restore the data on another server, all settings were restored successfully except the alerts:
none of the alert rules, channels, or notification settings were restored, and the Alerts page in the Grafana UI was empty.

Grafana 9.4 added the ability to export alert rules through the API. Hopefully this backup tool gets updated to support that.
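For example, something along these lines should pull all alert rules out of an instance (a rough sketch with Python requests; the URL and token are placeholders, and the endpoint path is my assumption rather than something I've verified against the 9.4 docs):

```python
import json
import requests

GRAFANA_URL = "http://localhost:3000"              # placeholder instance
HEADERS = {"Authorization": "Bearer <API_TOKEN>"}  # placeholder service-account token

# Fetch every alert rule visible to the token via the alerting provisioning API.
resp = requests.get(f"{GRAFANA_URL}/api/v1/provisioning/alert-rules", headers=HEADERS)
resp.raise_for_status()

# Save them so they can be pushed to another instance later.
with open("alert_rules.json", "w") as fh:
    json.dump(resp.json(), fh, indent=2)
```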

Could you confirm with the latest version that this works for you @LibiKorol?

Hi, I used the latest version (1.3.1) with Grafana 9.5.2 and all alerts were restored properly. However, the contact points and notification policies were not restored.
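As far as I can tell the provisioning API also exposes contact points and notification policies on their own endpoints, so in principle they could be backed up the same way; a rough sketch (endpoint paths as I understand them, not taken from the tool):

```python
import requests

GRAFANA_URL = "http://localhost:3000"              # placeholder instance
HEADERS = {"Authorization": "Bearer <API_TOKEN>"}  # placeholder service-account token

# Contact points and the notification policy tree are separate resources.
contact_points = requests.get(
    f"{GRAFANA_URL}/api/v1/provisioning/contact-points", headers=HEADERS).json()
policy_tree = requests.get(
    f"{GRAFANA_URL}/api/v1/provisioning/policies", headers=HEADERS).json()

print(f"{len(contact_points)} contact points, root receiver: {policy_tree.get('receiver')}")
```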

mt3593 commented

I use this tool as a sync across a few clusters, so I tend to set up contact points (Slack in my case) and notification policies per environment, since they can differ.
If we do add this capability, it would be good to have the option of omitting certain types from the backup/restore.

The restore of rules is not working for me. Using version 1.3.3 with Grafana 9.5.1.

I dug into this: create_alert_rule.py calls get_alert_rule, and if that returns a 404 it calls create_alert_rule; otherwise it calls update_alert_rule.
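In other words the restore path boils down to roughly this (my paraphrase, not the actual source):

```python
import requests

def restore_alert_rule(rule, grafana_url, headers):
    """Create the rule if the target instance does not know it yet, else update it."""
    uid = rule["uid"]
    get_resp = requests.get(
        f"{grafana_url}/api/v1/provisioning/alert-rules/{uid}", headers=headers)

    if get_resp.status_code == 404:
        # Rule does not exist yet: create it.
        return requests.post(
            f"{grafana_url}/api/v1/provisioning/alert-rules",
            headers=headers, json=rule)

    # Any other status (including the 500 I am seeing) falls through to an update.
    return requests.put(
        f"{grafana_url}/api/v1/provisioning/alert-rules/{uid}",
        headers=headers, json=rule)
```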

In my case it is returning a 500, not a 404. The Grafana documentation indicates it should return a 404, so I don't know why I'm getting a 500:

Getting alert rule
[DEBUG] resp status: 500
[DEBUG] resp body: {'message': 'could not find alert rule', 'traceID': ''}
|--Got a code: 500

In the update-rule code I had to add the x-disable-provenance header to get it to work.
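Roughly, the update request that worked for me looked like this (placeholders for the URL, token, and rule; my own snippet, not the tool's code):

```python
import requests

grafana_url = "http://localhost:3000"   # placeholder
rule = {"uid": "<rule-uid>"}            # stands in for the rule JSON from the backup

headers = {
    "Authorization": "Bearer <API_TOKEN>",  # placeholder token
    "Content-Type": "application/json",
    # Without this header Grafana rejected the update of the rule for me.
    "X-Disable-Provenance": "true",
}
requests.put(f"{grafana_url}/api/v1/provisioning/alert-rules/{rule['uid']}",
             headers=headers, json=rule)
```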

I'm not sure why I'm seeing this behavior. I'm wondering if it's because my Grafana instance is running in a Kubernetes cluster and I'm accessing it through a LoadBalancer, though I'm struggling to understand why that would matter.

@derekmceachern I got the same result as yours and I'm running it in Kubernetes.

[DEBUG] resp status: 500
[DEBUG] resp body: {'message': 'could not find alert rule', 'traceID': ''}
mt3593 commented

Smells like a Grafana bug; it should be a 404.
Has anyone raised this on the Grafana board yet, or is this only happening in Kubernetes?

@mt3593, I finally had a chance to look into this some more after your comment, and it turns out this has been fixed.

It took me a while to find the associated pull request but here it is: grafana/grafana#67331

According to the labels it was backported to v9.4.x and v9.5.x, and it was merged for the 10.0.0 release.

I upgraded to 10.0.1 in our Kubernetes environment and can confirm that this is fixed. Here is a snippet from my logs:

restoring alert_rule: /tmp/tmpezl_xm9x/_OUTPUT_/alert_rules/202311282012/b6e4b89e-5bdd-4ac1-ac19-0659420c4e67.alert_rule
===========================================================================
Getting alert rule
[DEBUG] resp status: 404
[DEBUG] resp body: None
|--Got a code: 404
Alert rule does not exist, creating

So, I would suggest that this item can be closed.

Hi everyone, I still get an error when restoring alerts (tried multiple environments; this is a fresh grafana:latest container with just one alert rule for testing):
```
restoring alert_rule: /tmp/tmpr9ms7hv9/_OUTPUT_/alert_rules/202409301300/adzglchghr18gf.alert_rule
[DEBUG] resp status: 404
[DEBUG] resp body:
return complexjson.loads(self.text, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/json/init.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/bin/grafana-backup", line 8, in
sys.exit(main())
^^^^^^
File "/usr/lib/python3.11/site-packages/grafana_backup/cli.py", line 55, in main
restore(args, settings)
File "/usr/lib/python3.11/site-packages/grafana_backup/restore.py", line 107, in main
restore_components(args, settings, restore_functions, tmpdir)
File "/usr/lib/python3.11/site-packages/grafana_backup/restore.py", line 145, in restore_components
restore_functions[ext](args, settings, file_path)
File "/usr/lib/python3.11/site-packages/grafana_backup/create_alert_rule.py", line 32, in main
get_response= get_alert_rule(uid, grafana_url, http_get_headers, verify_ssl, client_cert, debug)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/grafana_backup/dashboardApi.py", line 207, in get_alert_rule
return send_grafana_get(url, http_get_headers, verify_ssl, client_cert, debug)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/grafana_backup/dashboardApi.py", line 515, in send_grafana_get
return (r.status_code, r.json())
^^^^^^^^
File "/usr/lib/python3.11/site-packages/requests/models.py", line 975, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```

It looks like Grafana is returning an empty response body on that endpoint (tested with curl), and there is nothing useful in the Grafana logs:
grafana-1 | logger=context userId=2 orgId=1 uname=sa-1-backup t=2024-09-30T13:19:19.365337832Z level=info msg="Request Completed" method=GET path=/api/v1/provisioning/alert-rules/adzglchghr18gf status=404 remote_addr=redacted time_ms=9 duration=9.322269ms size=0 referer= handler=/api/v1/provisioning/alert-rules/:UID status_source=server
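A defensive tweak along these lines in send_grafana_get would at least avoid the crash when the body is empty (just a sketch of the idea, not a tested patch):

```python
import requests

def send_grafana_get(url, http_get_headers, verify_ssl, client_cert, debug):
    r = requests.get(url, headers=http_get_headers, verify=verify_ssl, cert=client_cert)
    if debug:
        print(f"[DEBUG] resp status: {r.status_code}")
    try:
        body = r.json()
    except ValueError:
        # Grafana answered with an empty or non-JSON body (e.g. this bare 404),
        # which is exactly what makes the unconditional r.json() call blow up.
        body = r.text or None
    return (r.status_code, body)
```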

But anyway, this seems like a Grafana issue; I just wanted to ask whether anyone else is facing it or whether I'm doing something wrong.