skynetservices/skydns

NXDOMAIN Redirection causing intermittent issues


We are using OpenShift version 3.11 and I'm pretty certain we're using SkyDNS.

The problem is that our corporate DNS server forwards to an upstream DNS server that performs NXDOMAIN redirection: instead of returning an NXDOMAIN response for a name that does not exist, it returns the address of an advertising site. As a result, we see failures like:

failed to create volume: Post http://heketi-storage.glusterfs.svc:8080/volumes: dial tcp 92.242.140.68:8080: i/o timeout

The 92.242.140.68 IP is the advertising site.
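One way to confirm that NXDOMAIN redirection is in play is to look up names that should not exist and see whether an address comes back anyway. The following is only a sketch: the helper names (`system_lookup`, `probe_names`, `redirected_addresses`) are made up for illustration, and the default `invalid` suffix (a reserved TLD) is an assumption. Some resolvers answer `.invalid` locally, so a different suffix may be needed to reach the upstream server.

```python
import socket
import uuid
from typing import Callable, Dict, List, Optional


def system_lookup(name: str) -> Optional[str]:
    """Resolve a name with the system resolver; None means no address came back."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        # Covers NXDOMAIN as well as timeouts and other resolver errors.
        return None


def probe_names(count: int = 3, suffix: str = "invalid") -> List[str]:
    """Build random absolute names (trailing dot) that should never exist."""
    return [f"nxprobe-{uuid.uuid4().hex}.{suffix}." for _ in range(count)]


def redirected_addresses(
    lookup: Callable[[str], Optional[str]] = system_lookup,
    count: int = 3,
    suffix: str = "invalid",
) -> Dict[str, str]:
    """Return every probe name that unexpectedly resolved, with its address."""
    hits: Dict[str, str] = {}
    for name in probe_names(count, suffix):
        address = lookup(name)
        if address is not None:
            hits[name] = address
    return hits
```

Calling `redirected_addresses()` on an affected node and getting a non-empty result would suggest that something upstream is rewriting NXDOMAIN responses. An empty result does not prove the opposite, since the probe suffix may be handled before it reaches the redirecting server.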

Apparently SkyDNS depends on an NXDOMAIN response in order to append .cluster.local.

Examples:
[root@appnode2 ~]# nslookup heketi-storage.glusterfs.svc
Server: 10.x.x.x
Address: 10.x.x.x#53

Non-authoritative answer:
Name: heketi-storage.glusterfs.svc
Address: 92.242.140.68

[root@appnode2 ~]# cat /etc/resolv.conf
# nameserver updated by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh
# Generated by NetworkManager
search cluster.local corp.company.com

So, because the server never gets an NXDOMAIN response for the short name, it does not append .cluster.local from the search list in /etc/resolv.conf.
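To make the failure mode concrete, here is a small simulation of how a glibc-style stub resolver walks the search list. It is a sketch under stated assumptions, not the actual resolver code: it assumes the default ndots value of 1 (the resolv.conf excerpt above shows no options line), the function names are hypothetical, and 10.0.0.10 is a placeholder for the masked internal address.

```python
from typing import Callable, List, Optional, Tuple

SEARCH = ["cluster.local", "corp.company.com"]  # from the resolv.conf above
RECORDS = {
    # Placeholder internal IP; the real address is masked in the report.
    "heketi-storage.glusterfs.svc.cluster.local": "10.0.0.10",
}


def candidate_names(name: str, search: List[str], ndots: int = 1) -> List[str]:
    """Order in which the stub resolver tries names for a lookup."""
    if name.endswith("."):
        return [name[:-1]]  # absolute name: the search list is not used
    searched = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + searched  # enough dots: try the name as-is first
    return searched + [name]


def resolve(
    name: str,
    search: List[str],
    lookup: Callable[[str], Optional[str]],
    ndots: int = 1,
) -> Optional[Tuple[str, str]]:
    """Return the first (candidate, address) pair that gets an answer."""
    for candidate in candidate_names(name, search, ndots):
        address = lookup(candidate)
        if address is not None:  # None stands in for NXDOMAIN
            return candidate, address
    return None


def honest_upstream(name: str) -> Optional[str]:
    """Upstream that answers NXDOMAIN (None) for unknown names."""
    return RECORDS.get(name)


def redirecting_upstream(name: str) -> Optional[str]:
    """Upstream that answers unknown names with the advertising IP."""
    return RECORDS.get(name, "92.242.140.68")
```

With `honest_upstream`, the lookup for heketi-storage.glusterfs.svc falls through to the .cluster.local candidate. With `redirecting_upstream`, the first candidate already gets an answer, so the search list is never consulted.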

If we add .cluster.local to the request, it resolves correctly:

[root@appnode2 ~]# nslookup heketi-storage.glusterfs.svc.cluster.local
Server: 10.x.x.x
Address: 10.x.x.x#53

Name: heketi-storage.glusterfs.svc.cluster.local
Address: 10.x.x.x < correct internal IP