hassio-addons/addon-grafana

New health check in v9.0.0 always fails

MartijnVdS opened this issue · 4 comments

Problem/Motivation

I just upgraded to v9.0.0 of the Grafana addon, and the container keeps restarting.

Expected behavior

Container starts and continues to run.

Actual behavior

The container restarts repeatedly: the health check fails, so the system assumes the container has crashed.

Steps to reproduce

  • Install v9.0.0 on a Home Assistant OS VM
  • The settings screen for this addon keeps showing "Addon is starting"

Or:

  • Log in on the console
  • Run docker ps periodically: the container stays in the "starting" health state and is restarted at regular intervals.

Proposed changes

The problem seems to occur because the Grafana instance listens on the container's own IP address (172.30.33.3:1337 in my case), while the health check tries to reach it at 127.0.0.1:1337. The connection fails, the health check reports an error, and the service is restarted.

The health check log can be found using docker inspect:

            "Health": {
                "Status": "starting",
                "FailingStreak": 1,
                "Log": [
                    {
                        "Start": "2023-07-08T12:34:51.703427263Z",
                        "End": "2023-07-08T12:34:51.762383047Z",
                        "ExitCode": 1,
                        "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\ncurl: (7) Failed to connect to 127.0.0.1 port 1337 after 0 ms: Couldn't connect to server\n"
                    }
                ]
            }
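To illustrate why a probe aimed at 127.0.0.1 can fail while the service is up, here is a minimal Python sketch of the suspected mismatch. It is not the addon's code. It assumes a Linux host, where any address in 127.0.0.0/8 can be bound, and it uses 127.0.0.2 as a stand-in for the container IP (172.30.33.3 above). The names can_connect and demo are made up for this example.

```python
import socket
from typing import Tuple


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable, ...
        return False


def demo() -> Tuple[bool, bool]:
    """Listen on one specific address, then probe that address and 127.0.0.1."""
    # Assumption: 127.0.0.2 plays the role of the container IP.
    bound_ip = "127.0.0.2"
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((bound_ip, 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        reachable_on_bound_ip = can_connect(bound_ip, port)
        reachable_on_loopback = can_connect("127.0.0.1", port)
    return reachable_on_bound_ip, reachable_on_loopback


if __name__ == "__main__":
    print(demo())
```

A listener bound to a single address only accepts connections sent to that address, so the probe against 127.0.0.1 is refused even though the listener is running. That would match the curl error (7) in the health check log, if the bind address is indeed the cause.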

I did some more research:

When I docker exec -ti sh into the container and run curl http://172.30.33.3:1337/api/health manually, I get an HTTP 403 (Forbidden) response, and this shows up in the log:

2023/07/08 14:40:12 [error] 444#444: *1 access forbidden by rule, client: 172.30.33.3, server: a0d7b954-grafana, request: "GET /api/health HTTP/1.1", host: "172.30.33.3:1337"

However, the exit code of the curl command is 0 (so the container would be considered "healthy").
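The exit code is 0 because curl, by default, does not treat HTTP error statuses as failures; with --fail (-f) it exits with code 22 on responses of 400 and above. For comparison, here is a rough Python sketch of a probe that does look at the status code. It is only an illustration, not the addon's actual health check, and the function name health_exit_code is made up for this example.

```python
import urllib.error
import urllib.request


def health_exit_code(url: str, timeout: float = 5.0) -> int:
    """Return 0 if url answers with a 2xx status, 1 otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 0 if 200 <= response.status < 300 else 1
    except urllib.error.HTTPError:
        # 4xx/5xx responses, such as the 403 from the access rule above.
        return 1
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, name resolution failure, ...
        return 1
```

A script could then end with sys.exit(health_exit_code("http://127.0.0.1:1337/api/health")), so that both a refused connection and a 403 count as unhealthy.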

Hmm, why didn't I see that locally? Will investigate.

G4KCM commented

Same problem after upgrading to v9.0.0.

Edit: after about ten minutes of restarting, it is now running OK.

Observing the same. The container keeps restarting. The API seems to work, though; only the UI does not come up and says "addon is starting".