Inconsistencies between qualified names on AWS nodes
rifelpet opened this issue · 2 comments
/kind bug
/kind failing-test
Our grid jobs for RHEL-based distros are failing a test that was recently unskipped for unrelated reasons (#16176)
https://testgrid.k8s.io/kops-grid#kops-grid-cilium-amzn2-k28
[FAIL] [sig-network] Networking Granular Checks: Services [It] should function for service endpoints using hostNetwork
[FAILED] failed dialing endpoint, did not find expected responses...
Tries 46
Command curl -g -q -s 'http://100.96.4.93:9080/dial?request=hostname&protocol=http&host=100.66.81.213&port=80&tries=1'
retrieved map[i-03b17693021906ac2.eu-west-1.compute.internal:{} i-03fbc6f079db37ce7.eu-west-1.compute.internal:{} i-0c90e87e766b90952.eu-west-1.compute.internal:{} i-0fd0d694876a8befc.eu-west-1.compute.internal:{}]
expected map[i-03b17693021906ac2:{} i-03fbc6f079db37ce7:{} i-0c90e87e766b90952:{} i-0fd0d694876a8befc:{}]
This test expects unqualified names but is receiving fully qualified names. The test's expected data comes from the kubernetes.io/hostname label on nodes (also the node name itself), which we see is the unqualified instance ID.
The test's actual data comes from running the hostname command on a hostNetwork pod.
A list of our distros and whether hostname returns a fully qualified name:
- AL 2 - yes
- AL 2023 - yes
- Debian 10 - no
- Debian 12 - no
- Flatcar - no
- RHEL 8 - yes
- RHEL 9 - yes
- Rocky 8 - yes
- Ubuntu 20.04 - no
- Ubuntu 22.04 - no
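For reference, a check like the one behind the list above can be sketched in Go. This is only an illustration, not how the list was produced: isQualified is a hypothetical helper, and the sketch assumes that os.Hostname returns the same kernel hostname that the hostname command prints on these Linux distros.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// isQualified reports whether name carries a domain part, i.e. contains
// a dot once any trailing root dot has been removed.
// Hypothetical helper, not part of kops or the e2e framework.
func isQualified(name string) bool {
	return strings.Contains(strings.TrimSuffix(name, "."), ".")
}

func main() {
	// Assumption: on Linux this is the kernel hostname, which is also
	// what the plain hostname command reports.
	name, err := os.Hostname()
	if err != nil {
		fmt.Fprintln(os.Stderr, "hostname lookup failed:", err)
		os.Exit(1)
	}
	fmt.Printf("%s qualified=%t\n", name, isQualified(name))
}
```

Run on a node (or in a hostNetwork pod), this would be expected to report qualified=true on the distros marked "yes" and qualified=false on the others.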
I think our best path forward would be to configure the RHEL-based distros to return the unqualified name for hostname. This would match behavior with the other distros.
Alternatively, we could make all node names fully qualified, like i-03fbc6f079db37ce7.eu-west-1.compute.internal, but this feels more disruptive.
This relates to kubernetes/kubernetes#121018. The e2e test logic could also be updated to handle either qualified or unqualified hostname outputs.
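A minimal sketch of what tolerant comparison could look like, assuming the test reduces both sides to the first DNS label before comparing. The names shortName and sameNodes are hypothetical and are not taken from the e2e framework. The sample values come from the failure output above.

```go
package main

import (
	"fmt"
	"strings"
)

// shortName returns everything before the first dot, so both
// "i-abc" and "i-abc.eu-west-1.compute.internal" map to "i-abc".
// Hypothetical helper; it would mangle IP addresses, so it only
// suits inputs known to be hostnames.
func shortName(host string) string {
	short, _, _ := strings.Cut(host, ".")
	return short
}

// toSet normalizes each host and collects the results into a set.
func toSet(hosts []string) map[string]struct{} {
	set := make(map[string]struct{}, len(hosts))
	for _, h := range hosts {
		set[shortName(h)] = struct{}{}
	}
	return set
}

// sameNodes reports whether expected and actual name the same set of
// nodes, ignoring any domain suffix on either side.
func sameNodes(expected, actual []string) bool {
	want, got := toSet(expected), toSet(actual)
	if len(want) != len(got) {
		return false
	}
	for name := range want {
		if _, ok := got[name]; !ok {
			return false
		}
	}
	return true
}

func main() {
	expected := []string{"i-03b17693021906ac2", "i-03fbc6f079db37ce7"}
	actual := []string{
		"i-03b17693021906ac2.eu-west-1.compute.internal",
		"i-03fbc6f079db37ce7.eu-west-1.compute.internal",
	}
	fmt.Println(sameNodes(expected, actual)) // prints "true"
}
```

Comparing on the first label keeps the test independent of distro hostname configuration, at the cost of treating two hosts that differ only in domain as the same node.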
/kind office-hours