rancher-sandbox/rancher-desktop

Error stopping k8s container on containerd


Actual Behavior

When using containerd container engine, attempting to stop a container corresponding to a pod displays an error:
[Screenshot of the error]
The same occurs trying to do so manually:

> nerdctl -n k8s.io stop 12dbee4b591a
FATA[0000] 1 errors:
unable to cleanup network for container: 12dbee4b591a

Steps to Reproduce

  1. Start Rancher Desktop with containerd backend with Kubernetes enabled.
  2. Create a pod:

    kubectl create deployment nginx-test --image=nginx:stable

  3. Open the Rancher Desktop main window and navigate to the Containers tab.
  4. In the Namespace drop down near the top right, select k8s.io as the namespace.
  5. Locate the nginx container (not the pause one), and click on the ⋮ button on the right side.
  6. Click on Stop.

Result

See the Actual Behavior section.

Expected Behavior

The container should be stopped. (Kubernetes may end up restarting it.)

Additional Information

Found while testing Qase test case RD-185.

Rancher Desktop Version

1.17.0-hackweek-release-254-g77398647d (1.17.0-RC1)

Rancher Desktop K8s Version

1.31.3

Which container engine are you using?

containerd (nerdctl)

What operating system are you using?

Windows

Operating System / Build Version

Windows 11 Pro 23H2 (Build 22631.4602)

What CPU architecture are you using?

x64

Linux only: what package format did you use to install Rancher Desktop?

None

Windows User Only

No response

This might have the same root cause as containerd/nerdctl#3765

> This might have the same root cause as containerd/nerdctl#3765

Almost certainly.

> This might have the same root cause as containerd/nerdctl#3765

It would be lovely to confirm that the fix in containerd/nerdctl#3771 does address this case.

Testing with nerdctl version 2.0.2-20-gadfa1760 still doesn't work:

time="2024-12-17T11:08:01-08:00" level=warning msg="Unable to read network annotation: this container was probably not started with nerdctl.No networking cleanup will be performed, which may likely result in a broken state for the other systems you used to manage these containers.Mixing completely different stacks to manage containers lifecycle is not recommended." error="unexpected end of JSON input"

> Testing with nerdctl version 2.0.2-20-gadfa1760 still doesn't work:
>
> time="2024-12-17T11:08:01-08:00" level=warning msg="Unable to read network annotation: this container was probably not started with nerdctl.No networking cleanup will be performed, which may likely result in a broken state for the other systems you used to manage these containers.Mixing completely different stacks to manage containers lifecycle is not recommended." error="unexpected end of JSON input"

@jandubois

This is no longer a hard error. It is now a warning (message is subject to change), and the kill (or stop) should proceed. Can you confirm or refute this?

Thanks!

> This is no longer a hard error. It is now a warning (message is subject to change), and the kill (or stop) should proceed.

It may be an issue with the calling code, but I continue to get an error dialog, so it still is a regression from earlier releases:

[Screenshot of the error dialog: CleanShot 2024-12-17 at 13 01 32@2x]

When I run it from a terminal, the stop seems to work, but the warning is definitely concerning, especially the "unexpected end of JSON input" part:

$ nerdctl -n k8s.io stop 08f71719c7c6
time="2024-12-17T13:16:46-08:00" level=warning msg="Unable to read network annotation: this container was probably not started with nerdctl.No networking cleanup will be performed, which may likely result in a broken state for the other systems you used to manage these containers.Mixing completely different stacks to manage containers lifecycle is not recommended." error="unexpected end of JSON input"
08f71719c7c6
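
To make the distinction between a warning and a hard error concrete, here is a minimal Python sketch that splits a `key=value` log line like the one above into fields and checks its `level`. The helper name `parse_logfmt` and the set of levels treated as fatal are assumptions made for illustration, and the message is shortened. It does not handle the `FATA[0000]` style of output seen elsewhere in this thread.

```python
import shlex


def parse_logfmt(line: str) -> dict[str, str]:
    """Split a `key=value key="quoted value"` log line into a dict."""
    fields = {}
    # shlex honours the double quotes, so quoted values stay in one token.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


# Log line from the transcript above, with the message shortened.
line = (
    'time="2024-12-17T13:16:46-08:00" level=warning '
    'msg="Unable to read network annotation: this container was probably '
    'not started with nerdctl." error="unexpected end of JSON input"'
)

fields = parse_logfmt(line)
# Assumption: only these levels should be surfaced as failures.
is_fatal = fields.get("level") in {"error", "fatal", "panic"}
print(fields["level"], is_fatal)
```

For this line the level is `warning`, so the check does not classify it as a failure, which matches the observation that the stop itself went through.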

But once stopped, I still cannot remove the container:

$ nerdctl -n k8s.io rm 08f71719c7c6
FATA[0000] 1 errors:
failed to load container networking options from specs: unexpected end of JSON input
Error: exit status 1

$ nerdctl -n k8s.io rm -f 08f71719c7c6
ERRO[0000] 1 errors:
failed to load container networking options from specs: unexpected end of JSON input
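
One plausible reading of "unexpected end of JSON input", not confirmed in this thread, is that the networking metadata is absent or empty for containers created by Kubernetes, and an empty string is being handed to a strict JSON parser. Go's `encoding/json` reports exactly that message for empty input. The Python sketch below shows the same failure mode and a tolerant alternative. The label key `example/networks` and the function `load_network_options` are hypothetical and are not nerdctl's actual names.

```python
import json


def load_network_options(labels: dict[str, str], key: str = "example/networks"):
    """Return parsed networking options, or None if the label is absent or empty.

    Returning None lets the caller treat the container as "not created by
    this tool" and skip network cleanup instead of failing outright.
    """
    raw = labels.get(key, "")
    if not raw.strip():
        return None
    return json.loads(raw)


# A strict parse of empty input fails, analogous to the error in the transcript.
try:
    json.loads("")
except json.JSONDecodeError as exc:
    print("strict parse failed:", exc.msg)

# The tolerant loader distinguishes "no metadata" from "valid metadata".
print(load_network_options({}))
print(load_network_options({"example/networks": '["bridge"]'}))
```

Whether skipping cleanup is safe for a container managed by another stack is a separate question, which is the concern the warning text raises.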

@apostasie I think it would be best if nerdctl could take an extra option on the stop, kill, rm, etc. commands that says: "I know you did not create these containers, but please clean them up for me anyway". That way you can get rid of the ugly warning as well, or turn it into a real error when the user didn't provide the --i-know-what-i-am-asking option.

I have no serious suggestions for the option name, unfortunately. --allow-foreign-container or something like that?

On a further note (and this should be discussed in the nerdctl repo): should nerdctl ps list containers that it cannot manipulate? Or should they be filtered out too, unless you specify the --allow-foreign-containers option? Otherwise it seems a bit inconsistent.

@jandubois moved the nerdctl part of the conversation to the PR.

As for:

> It may be an issue with the calling code, but I continue to get an error dialog, so it still is a regression from earlier releases:

I am not familiar with Rancher Desktop, so I cannot advise.
It feels bizarre, though, that a simple warning on stderr would pop up a dialog. You might want to look into that here (we might also just downgrade that to an info, so...)
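
As an illustration of the point about stderr, here is a minimal Python sketch of a caller that decides success from the exit status alone and keeps stderr text as non-fatal diagnostics. How Rancher Desktop actually invokes nerdctl is not shown in this thread, so `run_tool` is a hypothetical helper, and the child process is a stand-in that mimics a command which warns on stderr, prints a container ID, and exits 0.

```python
import subprocess
import sys


def run_tool(argv: list[str]) -> tuple[bool, str, str]:
    """Run a command; success is decided by the exit status, not by stderr."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout.strip(), proc.stderr.strip()


# Stand-in for a stop command: writes a warning to stderr, prints the ID, exits 0.
child = 'import sys; sys.stderr.write("level=warning\\n"); print("08f71719c7c6")'
ok, out, diagnostics = run_tool([sys.executable, "-c", child])

if ok:
    # Warnings can be logged without being shown as an error dialog.
    print("stopped:", out)
    if diagnostics:
        print("non-fatal output:", diagnostics)
else:
    print("failed:", diagnostics)
```

With this approach, a command such as the `rm` shown earlier, which exits with a non-zero status, would still be reported as a failure.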