tls: failed to verify certificate: x509: certificate is valid for <DOMAIN>, not console.redhat.com
CamZie opened this issue · 6 comments
Describe the bug
After installing OKD 4.15, we are getting these errors from the insights and authentication operators:
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication   4.15.3    True        False         True       10h     OAuthServerConfigObservationDegraded: failed to apply IDP Login config: tls: failed to verify certificate: x509: certificate is valid for *.apps.oc.domain.tld, *.oc.domain.tld, oc.domain.tld, not login.domain.tld
insights         4.15.3    False       False         True       2d23h   Unable to report: unable to build request to connect to Insights server: Post "https://console.redhat.com/api/ingress/v1/upload": tls: failed to verify certificate: x509: certificate is valid for *.apps.oc.domain.tld, *.oc.domain.tld, oc.domain.tld, not console.redhat.com
Version
ClusterVersion: Stable at "4.15.3"
Log bundle
ClusterVersion: Stable at "4.15.3"
ClusterOperators:
clusteroperator/authentication is degraded because OAuthServerConfigObservationDegraded: failed to apply IDP Login config: tls: failed to verify certificate: x509: certificate is valid for *.apps.oc.domain.tld, *.oc.domain.tld, oc.domain.tld, not login.domain.tld
clusteroperator/insights is not available (Unable to report: unable to build request to connect to Insights server: Post "https://console.redhat.com/api/ingress/v1/upload": tls: failed to verify certificate: x509: certificate is valid for *.apps.oc.domain.tld, *.oc.domain.tld, oc.domain.tld, not console.redhat.com) because Unable to report: unable to build request to connect to Insights server: Post "https://console.redhat.com/api/ingress/v1/upload": tls: failed to verify certificate: x509: certificate is valid for *.apps.oc.domain.tld, *.oc.domain.tld, oc.domain.tld, not console.redhat.com
Does anyone have an idea what could be the issue?
The log says:
[tls] certificate is valid for *.apps.oc.domain.tld, *.oc.domain.tld, oc.domain.tld, not login.domain.tld
So you have to generate a new SSL certificate that includes login.domain.tld
@codespearhead thanks for the tip. However "login.domain.tld" is the URL of our SSO. This is running on an external server with its own Let's Encrypt certificate.
I think this is most probably a problem with how the domain is being resolved, because we are getting the following errors in the DNS pod logs:
dns-default-9x8qn linux/amd64, go1.20.12 X:strictfipsruntime,
dns-default-9x8qn [INFO] 10.128.0.100:39274 - 64042 "A IN login.domain.tld.oc.domain.tld. udp 63 false 1232" - - 0 6.002317855s
dns-default-9x8qn [ERROR] plugin/errors: 2 login.domain.tld.oc.domain.tld. A: read udp 10.128.0.39:46929->9.9.9.9:53: i/o timeout
dns-default-9x8qn [INFO] 10.128.0.121:32824 - 32981 "A IN infogw.api.openshift.com.oc.domain.tld. udp 74 false 1232" - - 0 6.001218683s
dns-default-9x8qn [ERROR] plugin/errors: 2 infogw.api.openshift.com.oc.domain.tld. A: read udp 10.128.0.39:39464->9.9.9.9:53: i/o timeout
dns-default-9x8qn [INFO] 10.128.0.47:53644 - 40718 "A IN api.oc.domain.tld. udp 53 false 1232" - - 0 6.002321304s
dns-default-9x8qn [ERROR] plugin/errors: 2 api.oc.domain.tld. A: read udp 10.128.0.39:38144->9.9.9.9:53: i/o timeout
Somehow, whenever it looks up an external FQDN, e.g. infogw.api.openshift.com / login.domain.tld, it queries infogw.api.openshift.com.oc.domain.tld / login.domain.tld.oc.domain.tld instead, i.e. the base domain of our cluster is always appended at the end.
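For illustration, the appended suffix is consistent with how a stub resolver applies a search list. The following is a simplified sketch of glibc-style expansion, not the resolver's actual code. The function name `candidates` is made up for this example, and the `ndots` value of 5 is an assumption (it is the usual default inside Kubernetes pods; the thread does not show the pods' resolv.conf).

```shell
#!/bin/sh
# Simplified sketch of resolver search-list expansion.
# Usage: candidates NAME NDOTS SEARCH_DOMAIN...
# Prints the names a resolver would try, in order.
candidates() {
    name=$1
    ndots=$2
    shift 2

    # A trailing dot marks the name as fully qualified: no search list is applied.
    case $name in
        *.) printf '%s\n' "$name"; return ;;
    esac

    # Count the dots in the name.
    dots_only=$(printf '%s' "$name" | tr -cd '.')
    dots=${#dots_only}

    # With at least NDOTS dots, the name is tried as-is first.
    if [ "$dots" -ge "$ndots" ]; then
        printf '%s.\n' "$name"
    fi

    # Each search domain is appended in turn.
    for domain in "$@"; do
        printf '%s.%s.\n' "$name" "$domain"
    done

    # With fewer than NDOTS dots, the bare name is only tried last.
    if [ "$dots" -lt "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
}

# ndots=5 is an assumed value; oc.domain.tld is the search domain from this thread.
candidates login.domain.tld 5 oc.domain.tld
# prints:
#   login.domain.tld.oc.domain.tld.
#   login.domain.tld.
```

Under these assumptions, login.domain.tld has only two dots, so the first query sent is login.domain.tld.oc.domain.tld., which matches the name seen in the DNS log above.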
Show us what base domain is set in your cluster.
- Log into the cluster:
oc login --server=<your-cluster-api-url> -u <username> -p <password>
- Output its base domain:
oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}'
This is the output of the command (domain.tld is used to replace the real domain):
$ oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}'
apps.oc.domain.tld
I managed to find the cause. The /etc/resolv.conf of the host had the search parameter configured, which is why the base domain of the cluster was being appended to every DNS query in the cluster.
# Generated by NetworkManager
###search oc.domain.tld
nameserver .....
I removed this parameter and it works.
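For anyone who wants to check for the same condition, here is a minimal sketch that prints the search domains that are active (not commented out) in a resolv.conf-style file. The helper name `active_search_domains` is hypothetical, the sample file contents are made up, and 192.0.2.53 is a placeholder address, not the real nameserver. The sketch only looks at `search` lines and ignores the older `domain` directive.

```shell
#!/bin/sh
# Print the active search domains from a resolv.conf-style file, one per line.
# Commented-out lines such as "###search ..." are ignored; if several
# "search" lines exist, only the last one is used.
active_search_domains() {
    awk '
        $1 == "search" { line = $0 }
        END {
            n = split(line, f, " ")
            for (i = 2; i <= n; i++) print f[i]
        }
    ' "$1"
}

# Demo with a made-up file (placeholder nameserver address).
tmp=$(mktemp)
printf '%s\n' \
    '# Generated by NetworkManager' \
    'search oc.domain.tld' \
    'nameserver 192.0.2.53' > "$tmp"
active_search_domains "$tmp"    # prints: oc.domain.tld
rm -f "$tmp"
```

On a cluster node, the file to inspect can be displayed with something like `oc debug node/<node-name> -- chroot /host cat /etc/resolv.conf`; empty output from the helper means no search domain is active.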
Nice!
Can you close this issue then?