KillingSpark/rustysd

Stop commands might get ignored if the unit is currently restarting

KillingSpark opened this issue · 1 comments

The restart routine currently naively uses deactivate + activate to restart a unit. That leaves a small timewindow where the unit has the status 'Stopped(StoppedFinal)'. If a stop command that tries to deactivate that unit looks at it in this timeframe, it will ignore that unit, even though it will be started immediatly after.

This is a basic example of what could happen: (T1 is reactivating the unit, T2 is trying to just deactivate it)

status change T1 T2
running -> stoppedfinal deactivate
none, is already stopped deactivate
stoppedfinal -> running activate

In general the Unit should not be restarted if this kind of overlap happens, it is very likely the intended effect was to deactivate the unit. This is also very likely to be the result of a service exiting unexpectedly and restarting while a control command was issued to stop this service (or it was stopped due to another unit stopping and not being restarted). In any case I can think of it is either not correct or unexpected for the unit to still be running.

There are a few ways to fix this:

  1. move the status changing logic out of Unit::(de)activate and make status change a wrapper around Unit::(de)activate, something like Unit::(de)(re)activate_with_status. Here an additional Restarting as a toplevel status would be sufficient to communicate to other threads what this unit is currently doing.
  2. make deactivate take an additional argument that determins the state of the unit after it is finished deactivating. This would mean there is an additional Status like Stopped(Restarting).

I currently prefer the first solution but I will think more about this.

Kinda did the first solution with commit b71d3b0 . There is now a restarting status and an extra reactivation routine that takes care of properly updating the status.