kubernetes/kubernetes

Status manager does not normalize ephemeral container statuses

hshiina opened this issue · 6 comments

What happened?

EphemeralContainerStatuses is not normalized in normalizeStatus() in the status manager:

// update container statuses
for i := range status.ContainerStatuses {
	cstatus := &status.ContainerStatuses[i]
	normalizeContainerState(&cstatus.State)
	normalizeContainerState(&cstatus.LastTerminationState)
}
// Sort the container statuses, so that the order won't affect the result of comparison
sort.Sort(kubetypes.SortedContainerStatuses(status.ContainerStatuses))
// update init container statuses
for i := range status.InitContainerStatuses {
	cstatus := &status.InitContainerStatuses[i]
	normalizeContainerState(&cstatus.State)
	normalizeContainerState(&cstatus.LastTerminationState)
}
// Sort the container statuses, so that the order won't affect the result of comparison
kubetypes.SortInitContainerStatuses(pod, status.InitContainerStatuses)
return status

This does not seem to cause any user-facing problem because EphemeralContainerStatuses is sorted before being passed to the status manager, and timestamps are normalized during marshaling when the patch is created:

sort.Sort(kubetypes.SortedContainerStatuses(containerStatuses))

newPod, patchBytes, unchanged, err := statusutil.PatchPodStatus(context.TODO(), m.kubeClient, pod.Namespace, pod.Name, pod.UID, pod.Status, mergedStatus)

However, the unnormalized timestamps cause unexpected behavior internally. If the log verbosity is 3 or higher, the following messages are logged periodically by kubelet when an ephemeral container exists:

May 14 20:10:33 kind-control-plane kubelet[720]: I0514 20:10:33.437155 	720 status_manager.go:230] "Syncing all statuses"
May 14 20:10:33 kind-control-plane kubelet[720]: I0514 20:10:33.438512 	720 status_manager.go:984] "Pod status is inconsistent with cached status for pod, a reconciliation should be triggered" pod="default/ephemeral-demo" statusDiff=<
May 14 20:10:33 kind-control-plane kubelet[720]:     	  &v1.PodStatus{
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	... // 8 identical fields
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	PodIPs:            	{{IP: "10.244.0.5"}},
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	StartTime:         	s"2024-05-14 20:09:21 +0000 UTC",
May 14 20:10:33 kind-control-plane kubelet[720]:     	-     	InitContainerStatuses: nil,
May 14 20:10:33 kind-control-plane kubelet[720]:     	+     	InitContainerStatuses: []v1.ContainerStatus{},
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	ContainerStatuses: 	{{Name: "ephemeral-demo", State: {Running: &{StartedAt: {Time: s"2024-05-14 20:09:23 +0000 UTC"}}}, Ready: true, Image: "registry.k8s.io/pause:3.1", ...}},
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	QOSClass:          	"BestEffort",
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	EphemeralContainerStatuses: []v1.ContainerStatus{
May 14 20:10:33 kind-control-plane kubelet[720]:     	              	{
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	Name: "debugger-qk87g",
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	State: v1.ContainerState{
May 14 20:10:33 kind-control-plane kubelet[720]:     	                              	Waiting:	nil,
May 14 20:10:33 kind-control-plane kubelet[720]:     	-                             	Running:	&v1.ContainerStateRunning{StartedAt: v1.Time{Time: s"2024-05-14 20:09:35 +0000 UTC"}},
May 14 20:10:33 kind-control-plane kubelet[720]:     	+                             	Running:	&v1.ContainerStateRunning{StartedAt: v1.Time{Time: s"2024-05-14 20:09:35.697513091 +0000 UTC"}},
May 14 20:10:33 kind-control-plane kubelet[720]:     	                              	Terminated: nil,
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	},
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	LastTerminationState: {},
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	Ready:            	false,
May 14 20:10:33 kind-control-plane kubelet[720]:     	                      	... // 7 identical fields
May 14 20:10:33 kind-control-plane kubelet[720]:     	              	},
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	},
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	Resize:            	"",
May 14 20:10:33 kind-control-plane kubelet[720]:     	      	ResourceClaimStatuses: nil,
May 14 20:10:33 kind-control-plane kubelet[720]:     	  }
May 14 20:10:33 kind-control-plane kubelet[720]:  >

This message is emitted here:

klog.V(3).InfoS("Pod status is inconsistent with cached status for pod, a reconciliation should be triggered",
	"pod", klog.KObj(pod),
	"statusDiff", cmp.Diff(podStatus, &status))
return true

After needsReconcile() returns true, syncPod() is called. This does not look particularly harmful because unchanged eventually becomes true:

newPod, patchBytes, unchanged, err := statusutil.PatchPodStatus(context.TODO(), m.kubeClient, pod.Namespace, pod.Name, pod.UID, pod.Status, mergedStatus)
klog.V(3).InfoS("Patch status for pod", "pod", klog.KObj(pod), "podUID", uid, "patch", string(patchBytes))
if err != nil {
	klog.InfoS("Failed to update status for pod", "pod", klog.KObj(pod), "err", err)
	return
}
if unchanged {
	klog.V(3).InfoS("Status for pod is up-to-date", "pod", klog.KObj(pod), "statusVersion", status.version)

The API server is still called once unnecessarily in syncPod(), though:

pod, err := m.kubeClient.CoreV1().Pods(status.podNamespace).Get(context.TODO(), status.podName, metav1.GetOptions{})

What did you expect to happen?

It might be better to normalize ephemeral container statuses, at least their timestamps, in the status manager to avoid this unexpected behavior.

How can we reproduce it (as minimally and precisely as possible)?

Set the log verbosity of kubelet to 3 or higher. Then create an ephemeral container like this.
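The original report links an example; a command sequence along the lines of the upstream ephemeral-container docs (matching the pod and image names in the logs above) would be:

```shell
# Hypothetical reproduction sketch, assuming a cluster with
# EphemeralContainers available and kubelet running at -v=3 or higher.
kubectl run ephemeral-demo --image=registry.k8s.io/pause:3.1 --restart=Never
kubectl debug -it ephemeral-demo --image=busybox --target=ephemeral-demo
# Then watch the kubelet log for the periodic
# "Pod status is inconsistent with cached status" messages.
```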

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: v1.29.4
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2

Cloud provider

None

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

/sig node

@hshiina This is a bug. Can you submit a PR to fix it? If not, I'd be happy to work on it.

@zhifei92
If you are interested, please feel free to assign yourself.