openconfig/public

Lack of visibility of temperature alarm thresholds when not crossed

nkitchen opened this issue · 2 comments

Our customers have expressed concern about the limited information they can get from /components/component/state/temperature/alarm-threshold. It was a surprise to them that they could only see the value of any threshold while the temperature was exceeding it, even though it is clearly stated in the module. They have asked some reasonable questions:

  • How should they know the thresholds corresponding to different severities prior to exceeding them?
  • How can they (or a vendor test engineer) verify that the alarm paths are working?

We can imagine a model addition that could address this, for example:

  • /components/component/state/temperature/thresholds/threshold[severity=*]/state/value

But there are problems with it:

  • State containers are generally supposed to be as close as possible to the leaves of the tree. Right now the state/temperature container is fudging slightly; adding a threshold list to it would be a more blatant violation of the guideline.
  • What about alarms for other sensors besides temperature? Might it be better to define a model that works for thresholds and alarms of different types of sensors, instead of one limited to temperature thresholds? Or should we consider that /components/component/transceiver/thresholds already addresses the other important use cases?
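
To make the visibility gap concrete, here is a minimal Python sketch contrasting the behavior described above with the proposed threshold list. The severity names, the threshold values, and both function names are invented for illustration and do not come from any module. Reporting the highest crossed threshold when several are exceeded is also an assumption.

```python
from typing import Dict, Optional

# Hypothetical per-severity thresholds in degrees Celsius (illustrative values only).
THRESHOLDS: Dict[str, float] = {"MINOR": 70.0, "MAJOR": 85.0, "CRITICAL": 95.0}


def current_alarm_threshold(temperature: float) -> Optional[float]:
    """Mimic the behavior described in this issue: a threshold value is
    visible only while the temperature exceeds it.

    Assumption: when several thresholds are crossed, the highest is reported.
    """
    crossed = [t for t in THRESHOLDS.values() if temperature > t]
    return max(crossed) if crossed else None


def proposed_threshold_paths(component: str) -> Dict[str, float]:
    """Mimic the proposed list: every threshold is readable at its own path,
    regardless of the current temperature."""
    base = f"/components/component[name={component}]/state/temperature/thresholds"
    return {
        f"{base}/threshold[severity={severity}]/state/value": value
        for severity, value in THRESHOLDS.items()
    }
```

At 40 degrees the first function returns `None`, so a client learns nothing about the thresholds, while the second exposes all three values at any temperature.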

I think this leaf is underspecified. There should be an upper and a lower threshold for temperature. The temperature threshold state can be static for a given piece of hardware. Optionally, in some cases a soft threshold could also be configured to raise an alarm before a hard limit is reached.

Recently upper/lower temperature thresholds were added specifically for transceivers. See:
(https://openconfig.net/projects/models/schemadocs/yangdoc/openconfig-platform.html#components-component-transceiver-thresholds-threshold-state-module-temperature-lower)

Perhaps we should adopt this for the generalized component tree as well. WDYT @nkitchen ?

So /components/component[name=XYZ]/state/temperature/alarm-status would indicate a crossing of some /components/component[name=XYZ]/thresholds/threshold/state/temperature-upper? That looks to me like it would work.
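
A rough sketch of that relationship in Python, assuming a generalized threshold entry carries a `temperature-upper` value as in the path above. The `Threshold` class, the `alarm_status` function, and the strict "greater than" comparison are assumptions for illustration, not anything the model defines.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Threshold:
    """Hypothetical stand-in for one thresholds/threshold/state entry."""
    severity: str
    temperature_upper: float  # degrees Celsius


def alarm_status(instant: float, thresholds: Iterable[Threshold]) -> bool:
    """Return True if the instantaneous temperature is above any upper threshold.

    Assumption: a crossing means strictly greater than the threshold value.
    """
    return any(instant > t.temperature_upper for t in thresholds)
```

With this shape the thresholds stay readable whether or not the alarm is active, and alarm-status is just a derived view over them.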