unioslo/nivlheim

Try to avoid calling panic in the Go code

Closed · 1 comment

The status API already reports jobs that fail. It's OK for job code to panic, since the panic is recovered and reported.
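For illustration, the recover-and-report pattern could look roughly like the sketch below. The name `runJob` and the error format are made up for this example and are not taken from the Nivlheim code.

```go
package main

import (
	"fmt"
	"log"
)

// runJob is a hypothetical wrapper (not the actual Nivlheim function).
// It runs job and turns a panic into an ordinary error, so that the
// caller can report the failure instead of crashing the process.
func runJob(name string, job func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("job %s panicked: %v", name, r)
		}
	}()
	job()
	return nil
}

func main() {
	// A job that panics is reported as a failure; the program keeps running.
	if err := runJob("example", func() { panic("sql query failed") }); err != nil {
		log.Println(err)
	}
}
```

A panic outside such a wrapper is not recovered and terminates the process.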

In other parts of the code, however, panic would cause the whole system service to crash and be restarted.

It would be better to log the error and have the status API report it.

  • Have the status API show a list of the latest errors and warnings.
  • This should be easily machine-parseable for monitoring purposes.
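A minimal sketch of what such a list could look like, assuming a small in-memory buffer served as JSON. The type names, field names, and buffer size are all hypothetical and not part of the existing status API.

```go
package main

import (
	"encoding/json"
	"net/http"
	"os"
	"sync"
	"time"
)

// logEntry is a hypothetical record format for one error or warning.
type logEntry struct {
	Time    time.Time `json:"time"`
	Level   string    `json:"level"` // "error" or "warning"
	Message string    `json:"message"`
}

// recentLog keeps only the newest max entries.
type recentLog struct {
	mu      sync.Mutex
	max     int
	entries []logEntry
}

// add appends an entry and drops the oldest ones beyond max.
func (r *recentLog) add(level, msg string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.entries = append(r.entries, logEntry{Time: time.Now().UTC(), Level: level, Message: msg})
	if len(r.entries) > r.max {
		r.entries = r.entries[len(r.entries)-r.max:]
	}
}

// snapshot returns a copy of the current entries, oldest first.
func (r *recentLog) snapshot() []logEntry {
	r.mu.Lock()
	defer r.mu.Unlock()
	return append([]logEntry{}, r.entries...)
}

// ServeHTTP lets recentLog be mounted as an http.Handler that returns JSON.
func (r *recentLog) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(r.snapshot())
}

func main() {
	recent := &recentLog{max: 50}
	recent.add("error", "example: sql query failed")
	// Print the same JSON that the handler would serve.
	json.NewEncoder(os.Stdout).Encode(recent.snapshot())
}
```

A monitoring system could then poll the endpoint and alert on any entry with level "error".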

It turns out that almost all of the panic() calls are made from places where the panic will be recovered and handled.
The exception is taskrunner.go. It may panic if the SQL query fails, and that will crash the program.

  • However, another point here is that if the database disconnects or goes down, the program must attempt to reconnect. Currently, the only way to reconnect is to restart the program, and the only way that happens is if taskrunner.go panics. Remember that systemd will automatically restart the service.
    So the code actually works as desired as-is.

As for logging/monitoring, the status API already shows error messages for panic() calls that are caught. Panic messages are also written to stdout/stderr and end up in the systemd log.

The issue can be closed, no changes are needed.