@mslacken @e4t Could you check whether this FAQ entry is correct?
Current version
<para> What is the difference between the state <literal>down</literal>
and <literal>down*</literal>? </para>
</question>
<answer>
<para> A <literal>*</literal> shown after a status code means that the
node is not responding. </para>
<para> Thus, when a node is marked as <literal>down*</literal>, this means
that the node is not reachable due to network issues, or its
<literal>slurmd</literal> is not running. </para>
<para> In the <literal>down</literal> state, the node is reachable, but
the node was rebooted unexpectedly, the hardware does not match the
description in <filename>slurm.conf</filename>, or a health check
configured with <literal>HealthCheckProgram</literal> failed. </para>
Original version which explained "down" twice
<para>What is the difference between the state <literal>down</literal> and <literal>down*</literal>?</para>
</question>
<answer>
<para>
When a node is marked as <literal>down</literal> this means that the node is not reachable due to network issues or the <literal>slurmd</literal> is not running. In the <literal>down</literal> state the node is reachable, but the node was rebooted unexpectedly, the hardware does not match the description in <filename>slurm.conf</filename> or a healthcheck configured with the <literal>HealthCheckProgram</literal>.
</para>
</answer>