atc0005/check-cert

Audit plugin exit states to determine if another state (e.g., UNKNOWN) would be more applicable

atc0005 opened this issue · 1 comments

Overview

While implementing performance data handling as part of #445, I found myself choosing between two ways of handling a failure:

  • emitting an error and continuing
  • exiting with an UNKNOWN status

Either seemed more appropriate than exiting with a CRITICAL or WARNING state.

To keep the current logic mostly intact, I opted to emit the error as a log message and continue:

	pd, perfDataErr := getPerfData(certChain, cfg.AgeCritical, cfg.AgeWarning)
	if perfDataErr != nil {
		log.Error().
			Err(perfDataErr).
			Msg("failed to generate performance data")

		// Surface the error in plugin output.
		plugin.AddError(perfDataErr)

		// TODO: Abort plugin execution with UNKNOWN status?
	}

	if err := plugin.AddPerfData(false, pd...); err != nil {
		log.Error().
			Err(err).
			Msg("failed to add performance data")

		// Surface the error in plugin output.
		plugin.AddError(err)

		// TODO: Abort plugin execution with UNKNOWN status?
	}

I'm filing this GH issue as a reminder to revisit that decision in the larger context of current plugin logic as it pertains to the Nagios Plugin Guidelines (see refs):

| Numeric Value | Service Status | Status Description |
|---------------|----------------|--------------------|
| 0 | OK | The plugin was able to check the service and it appeared to be functioning properly |
| 1 | Warning | The plugin was able to check the service, but it appeared to be above some "warning" threshold or did not appear to be working properly |
| 2 | Critical | The plugin detected that either the service was not running or it was above some "critical" threshold |
| 3 | Unknown | Invalid command line arguments were supplied to the plugin or low-level failures internal to the plugin (such as unable to fork, or open a tcp socket) that prevent it from performing the specified operation. Higher-level errors (such as name resolution errors, socket timeouts, etc) are outside of the control of plugins and should generally NOT be reported as UNKNOWN states. |
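As a rough aid for the audit, the guidance quoted above could be encoded as a small classification helper. This is only a sketch: the `failureKind` categories and the `exitCodeFor` function are hypothetical names, and mapping higher-level errors to CRITICAL is an assumption (the guidelines only say such errors should generally not be UNKNOWN).

```go
package main

import "fmt"

// Exit codes as listed in the Nagios Plugin Guidelines.
const (
	stateOK       = 0
	stateWarning  = 1
	stateCritical = 2
	stateUnknown  = 3
)

// failureKind is a hypothetical classification of why a check did not
// complete normally.
type failureKind int

const (
	noFailure        failureKind = iota
	invalidArguments             // bad command-line flags
	internalFailure              // low-level failure inside the plugin
	serviceFailure               // higher-level error, e.g. DNS or socket timeout
)

// exitCodeFor maps a failure category to an exit code. Unrecognized
// categories fall back to UNKNOWN.
func exitCodeFor(kind failureKind) int {
	switch kind {
	case noFailure:
		return stateOK
	case invalidArguments, internalFailure:
		return stateUnknown
	case serviceFailure:
		// Assumption: treated as CRITICAL here; the guidelines only
		// rule out UNKNOWN for this category.
		return stateCritical
	default:
		return stateUnknown
	}
}

func main() {
	fmt.Println("invalid arguments:", exitCodeFor(invalidArguments))
	fmt.Println("internal failure: ", exitCodeFor(internalFailure))
	fmt.Println("service failure:  ", exitCodeFor(serviceFailure))
}
```

Under this reading, a failure to generate or attach performance data would fall into the internal-failure category and map to UNKNOWN.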

References

Considering this resolved per the work on: