Incorrect response size calculation
kdima opened this issue · 6 comments
We are seeing issues with DNS responses from the skydns server when the response size is around 550 bytes.
Here is a repro case. I am running skydns locally and have modified my resolv.conf to point at localhost.
dig some-domain.example.com SRV
; <<>> DiG 9.10.3-P4-Ubuntu <<>> some-domain.example.com SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62612
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 3
;; QUESTION SECTION:
;some-domain.example.com. IN SRV
;; ANSWER SECTION:
some-domain.example.com. 219 IN SRV 10 33 5601 a.some-domain.example.com.
some-domain.example.com. 219 IN SRV 10 33 5601 b.some-domain.example.com.
some-domain.example.com. 219 IN SRV 10 33 5601 c.some-domain.example.com.
;; ADDITIONAL SECTION:
a.some-domain.example.com. 219 IN A 1.2.3.4
b.some-domain.example.com. 273 IN A 1.2.3.5
c.some-domain.example.com. 240 IN A 1.2.3.6
;; Query time: 38 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Feb 13 10:50:54 GMT 2017
;; MSG SIZE rcvd: 568
Dig succeeds and reports message size 568.
If I now do the same resolution with the Go DNS resolver, using this code:
package main

import (
	"fmt"
	"net"
)

func main() {
	_, a, err := net.LookupSRV("", "", "some-domain.example.com")
	if err != nil {
		fmt.Printf("err is %s\n", err.Error())
	}
	fmt.Printf("res is %+v\n", a)
}
Output is
err is lookup some-domain.example.com on 127.0.0.1:53: read udp 127.0.0.1:44438->127.0.0.1:53: i/o timeout
res is []
On the other hand, if I look up something with a larger response, e.g. around 800 bytes, both dig and the Go resolver work.
While investigating a broken lookup I added some debugging to skydns and printed what Msg.Len() returns when called inside Fit. It returns a message size of 457, even though dig reports a message size of 568 for the reply it receives.
I have tried 2 fixes that both seem to work:
- I have disabled compression inside the Msg.Len function
- I have added an extra 100 bytes to the Msg.Len result
So it looks to me like compression is being handled incorrectly somewhere. It also looks like these issues only started after the upstream miekg/dns library was updated and disabled compression, but I have not verified this.
Here is the tcpdump
12:11:13.284016 IP (tos 0x0, ttl 64, id 38646, offset 0, flags [DF], proto UDP (17), length 101)
ip6-localhost.57863 > ip6-localhost.domain: [bad udp cksum 0xfe64 -> 0x9cb4!] 12312+ SRV? something.example.com. (73)
12:11:13.322520 IP (tos 0x0, ttl 64, id 38651, offset 0, flags [DF], proto UDP (17), length 596)
ip6-localhost.domain > ip6-localhost.57863: [bad udp cksum 0x0054 -> 0xb89e!] 12312* q: SRV? something.example.com. 3/0/3 something.example.com. SRV a-07ecf750d42cbe603.something.example.com.:5601 10 33, something.example.com. SRV a-03bddc860f9a014cc.something.example.com.:5601 10 33, something.example.com. SRV a-0901ac67c31f85903.something.example.com.:5601 10 33 ar: a-07ecf750d42cbe603.something.example.com. A 1.2.49.214, a-03bddc860f9a014cc.something.example.com. A 1.2.54.86, a-0901ac67c31f85903.something.example.com. A 1.2.38.64 (568)
12:11:18.284740 IP (tos 0x0, ttl 64, id 39762, offset 0, flags [DF], proto UDP (17), length 101)
ip6-localhost.48112 > ip6-localhost.domain: [bad udp cksum 0xfe64 -> 0x8721!] 27586+ SRV? something.example.com. (73)
12:11:18.323786 IP (tos 0x0, ttl 64, id 39767, offset 0, flags [DF], proto UDP (17), length 596)
ip6-localhost.domain > ip6-localhost.48112: [bad udp cksum 0x0054 -> 0x2e9e!] 27586* q: SRV? something.example.com. 3/0/3 something.example.com. SRV a-03bddc860f9a014cc.something.example.com.:5601 10 33, something.example.com. SRV a-0901ac67c31f85903.something.example.com.:5601 10 33, something.example.com. SRV a-07ecf750d42cbe603.something.example.com.:5601 10 33 ar: a-03bddc860f9a014cc.something.example.com. A 1.2.54.86, a-0901ac67c31f85903.something.example.com. A 1.2.38.64, a-07ecf750d42cbe603.something.example.com. A 1.2.49.214 (568)
The request was done using go dns.
Here is the same request result using dig
dig some-domain.example.com SRV
; <<>> DiG 9.10.3-P4-Ubuntu <<>> something.example.com SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62612
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 3
;; QUESTION SECTION:
;something.example.com. IN SRV
;; ANSWER SECTION:
something.example.com. 219 IN SRV 10 33 5601 a.something.example.com.
something.example.com. 219 IN SRV 10 33 5601 b.something.example.com.
something.example.com. 219 IN SRV 10 33 5601 c.something.example.com.
;; ADDITIONAL SECTION:
a.something.example.com. 219 IN A 1.2.3.4
b.something.example.com. 273 IN A 1.2.3.5
c.something.example.com. 240 IN A 1.2.3.6
;; Query time: 38 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Feb 13 10:50:54 GMT 2017
;; MSG SIZE rcvd: 568
Ignore the different IPs; this is me sanitizing the results.
As far as I understand, the Go resolver does not accept replies larger than 512 bytes.
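A small sketch of how the size check and the 512-byte limit interact, using the two numbers from this report. `udpLimit` and `mustTruncate` are hypothetical names, not skydns or miekg/dns functions, and the 4096-byte EDNS0 buffer is an illustrative assumption about what a client such as dig might advertise.

```go
package main

import "fmt"

// defaultUDPSize is the RFC 1035 limit for DNS messages over UDP when the
// client has not advertised a larger buffer via EDNS0.
const defaultUDPSize = 512

// udpLimit returns the largest UDP reply the client can be assumed to accept.
// advertised is the EDNS0 buffer size from the query, or 0 when absent.
func udpLimit(advertised int) int {
	if advertised < defaultUDPSize {
		return defaultUDPSize
	}
	return advertised
}

// mustTruncate reports whether a reply of the given size needs truncating.
func mustTruncate(size, advertised int) bool {
	return size > udpLimit(advertised)
}

func main() {
	const estimated, actual = 457, 568 // Msg.Len() value vs size seen on the wire

	fmt.Println("no EDNS0, estimated size:", mustTruncate(estimated, 0))
	fmt.Println("no EDNS0, actual size:   ", mustTruncate(actual, 0))
	fmt.Println("EDNS0 4096, actual size: ", mustTruncate(actual, 4096))
}
```

With the underestimated size the check concludes the reply fits, so a 568-byte datagram goes out untruncated to a client that only reads 512 bytes. A reply around 800 bytes would presumably still be estimated above 512 and get truncated properly, which may be why the larger lookup works.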
Just to clarify, this is not a stubzone lookup.