NewSystemd must not ignore UnitExists
kolyshkin opened this issue · 3 comments
This is the same issue as described in opencontainers/runc#3780
In short, NewSystemd
ignores UnitExists
error from systemd, as a result the PID is not added to the proper systemd unit and thus the cgroup.
See the above issue for details as to how that may happen.
See the opencontainers/runc#3782 as to how this might be fixed.
I'll take a look at this
I assume that this is the block that you're talking about refactoring? In a nutshell, are you proposing the following:
- If you get a unit exists error, reset the systemd unit like here and try again.
- If the retry fails, return an error rather than quietly complete.
I'll see if I can throw together a PR. Seems straightforward enough, if my understanding is correct...
I assume that this is the block that you're talking about refactoring? In a nutshell, are you proposing the following:
- If you get a unit exists error, reset the systemd unit like here and try again.
- If the retry fails, return an error rather than quietly complete.
I'll see if I can throw together a PR. Seems straightforward enough, if my understanding is correct...
@mmerkes yes, your understanding is correct. The most straightforward fix would be to merely return a "unit already exists" error and when fix the callers who do not expect it. The slightly more elaborated one is to call reset-failed (which will remove the failed/non-running unit, if exists) and retry, like I did in runc/libcontainer/cgroups.