kube-proxy failed to restore iptables
yongxiu opened this issue · 5 comments
kube-proxy version: kube-proxy-amd64:v1.22.8
kube-proxy failed to restore iptables rules; the verbose logs are below:
I1225 03:04:45.320763 1 proxier.go:1355] "Opened local port" port="\"nodePort for gke-system/istio-ingress:status-port\" (:31937/tcp4)"
I1225 03:04:45.321514 1 traffic.go:91] [DetectLocalByCIDR (10.240.0.0/13)] Jump Not Local: [-m comment --comment "gke-system/istiod:https-webhook cluster IP" -m tcp -p tcp -d 172.26.172.173/32 --dport 443 ! -s 10.240.0.0/13 -j KUBE-MARK-MASQ]
I1225 03:04:45.321712 1 traffic.go:91] [DetectLocalByCIDR (10.240.0.0/13)] Jump Not Local: [-m comment --comment "cert-manager/cert-manager-webhook:https cluster IP" -m tcp -p tcp -d 172.26.75.7/32 --dport 443 ! -s 10.240.0.0/13 -j KUBE-MARK-MASQ]
I1225 03:04:45.321945 1 traffic.go:91] [DetectLocalByCIDR (10.240.0.0/13)] Jump Not Local: [-m comment --comment "gke-system/istio-ingress:https cluster IP" -m tcp -p tcp -d 172.26.94.145/32 --dport 443 ! -s 10.240.0.0/13 -j KUBE-MARK-MASQ]
I1225 03:04:45.322172 1 proxier.go:1355] "Opened local port" port="\"nodePort for gke-system/istio-ingress:https\" (:30429/tcp4)"
I1225 03:04:45.322334 1 traffic.go:91] [DetectLocalByCIDR (10.240.0.0/13)] Jump Not Local: [-m comment --comment "default/kubernetes:https cluster IP" -m tcp -p tcp -d 172.26.0.1/32 --dport 443 ! -s 10.240.0.0/13 -j KUBE-MARK-MASQ]
I1225 03:04:45.322489 1 traffic.go:91] [DetectLocalByCIDR (10.240.0.0/13)] Jump Not Local: [-m comment --comment "cert-manager/cert-manager:tcp-prometheus-servicemonitor cluster IP" -m tcp -p tcp -d 172.26.147.25/32 --dport 9402 ! -s 10.240.0.0/13 -j KUBE-MARK-MASQ]
I1225 03:04:45.322636 1 traffic.go:91] [DetectLocalByCIDR (10.240.0.0/13)] Jump Not Local: [-m comment --comment "gke-system/istiod:http-monitoring cluster IP" -m tcp -p tcp -d 172.26.172.173/32 --dport 9093 ! -s 10.240.0.0/13 -j KUBE-MARK-MASQ]
I1225 03:04:45.322799 1 traffic.go:91] [DetectLocalByCIDR (10.240.0.0/13)] Jump Not Local: [-m comment --comment "anthos-identity-service/ais:info cluster IP" -m tcp -p tcp -d 172.26.118.2/32 --dport 9901 ! -s 10.240.0.0/13 -j KUBE-MARK-MASQ]
I1225 03:04:45.324067 1 proxier.go:1621] "Restoring iptables" rules=https://gist.github.com/yongxiu/81bfb3f8974e4aba29b4918bbccb3c82
I1225 03:04:45.324871 1 iptables.go:419] running iptables-restore [-w 5 -W 100000 --noflush --counters]
E1225 03:04:45.338528 1 proxier.go:1624] "Failed to execute iptables-restore" err="exit status 1 (iptables-restore: line 398 failed\n)"
I1225 03:04:45.339920 1 proxier.go:1627] "Closing local ports after iptables-restore failure"
The iptables rules are in https://gist.github.com/yongxiu/81bfb3f8974e4aba29b4918bbccb3c82
I tried to run the restore manually: iptables-restore -w 5 -W 100000 --noflush --counters < rule.txt --verbose
It succeeded only after I commented out these 3 lines in rule.txt:
# -X KUBE-SVC-G4P4IPQ4JUEESJSA
# -X KUBE-SVC-A32MGCDFPRQGQDBB
# -X KUBE-SVC-JTXKX5D7NT2O6RLC
I wonder how to debug the root cause?
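One way to narrow this down, as a sketch rather than anything from the original report: the error names line 398 of the restore input, and -X <chain> is only accepted when the chain exists, is empty, and is not referenced by any other rule. Assuming the ruleset from the gist is saved locally as rule.txt (placeholder name), something like the following shows the failing line and why the deletes are rejected:

# print the rule on the line iptables-restore reported as failing (line 398)
sed -n '398p' rule.txt

# -X <chain> only succeeds if the chain is empty and nothing jumps to it;
# list leftover rules in, or references to, each chain in the live NAT table
for chain in KUBE-SVC-G4P4IPQ4JUEESJSA KUBE-SVC-A32MGCDFPRQGQDBB KUBE-SVC-JTXKX5D7NT2O6RLC; do
  echo "== $chain =="
  iptables-save -t nat | grep -- "$chain"
done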
https://gist.github.com/yongxiu/8e85219ef90d8e8b3042b96374b7a60e is the result of iptables-save. It seems that if an existing rule with the same name is already present, the restore fails.
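One way to test that hypothesis, sketched under the assumption that the restore input and the iptables-save dump from the two gists are saved locally as rule.txt and save.txt (placeholder names): intersect the chains the restore tries to delete with the chains still visible in the saved state.

# chains the restore input tries to delete with -X
grep '^-X KUBE-SVC-' rule.txt | sed 's/^-X //' | sort -u > deleted-chains.txt

# KUBE-SVC chains that still appear (definitions, rules, or jump targets) in the saved state
grep -oE 'KUBE-SVC-[A-Z0-9]+' save.txt | sort -u > present-chains.txt

# chains kube-proxy wants to delete that are still present or referenced
comm -12 deleted-chains.txt present-chains.txt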
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
> - After 90d of inactivity, lifecycle/stale is applied
> - After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
> - After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
>
> You can:
> - Reopen this issue with /reopen
> - Mark this issue as fresh with /remove-lifecycle rotten
> - Offer to help out with Issue Triage
>
> Please send feedback to sig-contributor-experience at kubernetes/community.
>
> /close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.