StefanSchubert/sabi

Infrastructure Monitor Problem

StefanSchubert opened this issue · 2 comments

Describe the bug
During the past 4 months we had more energy and internet provider problems than in the previous 3 years combined.
Twice a fiber cable was cut by nearby construction works, causing 6 h and 12 h outages.
Twice the power supply failed: once due to construction works, and once because a young farmer hit a power pylon with his tractor. Each outage lasted 4 h.

Running the service at home on Raspberry Pis seems to have major drawbacks.
But as long as the user base is not significant, I won't bear the cost of moving to the cloud.

Instead, the first countermeasure would be to increase monitoring so that I can react better. The latest power outage seems to have damaged the SD card on one of the Pis, which broke the service.

Currently I have configured Uptrends to check the frontend server (for HTTP 200) and send me an email.
However, this does not check the middleware Pi.

The goal is to implement a health check endpoint that Uptrends can use and that also checks the middleware and the responsiveness of the database. The middleware already has a health check endpoint which includes a DB test, but this endpoint is reachable only on the local network. Do we need to call it from the frontend?

Meanwhile we have included Spring Actuator, thus:

GET http://localhost:8080/sabi/actuator/health

results in

{
  "status": "UP",
  "components": {
    "db": {
      "status": "UP",
      "details": {
        "database": "MariaDB",
        "validationQuery": "isValid()"
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 62529126400,
        "free": 53843865600,
        "threshold": 10485760,
        "exists": true
      }
    },
    "hazelcast": {
      "status": "UP",
      "details": {
        "name": "sabi-hzCache-Instance",
        "uuid": "4cb07219-18a6-41b1-9b5b-39c368076305"
      }
    },
    "mail": {
      "status": "UP",
      "details": {
        "location": "smtp.strato.de:587"
      }
    },
    "ping": {
      "status": "UP"
    }
  }
}

Because of the registration workflow, the middleware API is also reachable from outside (and should be, since we want to enable native mobile clients as well). So including this check, e.g. in Uptrends, is already possible.
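To illustrate what such an external check would evaluate, here is a minimal Java sketch of a probe against the Actuator health endpoint. It is not part of sabi: the class name `MiddlewareHealthProbe` and its methods are hypothetical, and the URL is the local one shown above, so a public hostname would have to be substituted. With default settings Spring Boot Actuator answers with HTTP 503 when the aggregated status is `DOWN`, so checking for HTTP 200 should already cover the DB component. The additional body check assumes that `status` is the first key of the JSON document, as in the sample response.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the sabi code base.
class MiddlewareHealthProbe {

    // Assumption: "status" is the first key of the response, as in the sample above.
    private static final Pattern TOP_LEVEL_UP =
            Pattern.compile("^\\s*\\{\\s*\"status\"\\s*:\\s*\"UP\"");

    /** Pure decision logic: healthy only on HTTP 200 with a top-level status of UP. */
    static boolean evaluate(int httpStatus, String body) {
        if (httpStatus != 200 || body == null) {
            return false;
        }
        return TOP_LEVEL_UP.matcher(body).find();
    }

    /** Calls the health URL once; any I/O error or timeout counts as unhealthy. */
    static boolean probe(String healthUrl, Duration timeout) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(timeout)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(healthUrl))
                .timeout(timeout)
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return evaluate(response.statusCode(), response.body());
        } catch (IOException e) {
            return false; // connection refused, timeout, DNS failure, ...
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // URL taken from the issue text; replace with the externally reachable one.
        String url = "http://localhost:8080/sabi/actuator/health";
        boolean healthy = probe(url, Duration.ofSeconds(5));
        System.out.println(healthy ? "UP" : "DOWN or unreachable");
    }
}
```

Since a monitoring service that only looks at the HTTP status code sees the same 200/503 distinction, a single check object pointed at this endpoint may be enough to cover both the middleware and the database.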

For info: the Uptrends free plan supports only one check object.