Ringpop discovery churn due to DNS truncation
Is your feature request related to a problem? Please describe.
We run Cadence with DNS-based ringpop discovery. After scaling the frontend to more than 3 instances, we are seeing a significant number of log lines such as:
Add new peers by DNS lookup
and Remove stale peers by DNS lookup
Proposed Solution
We believe this is due to DNS truncation of the now larger responses. Prior to Go 1.19, the pure Go resolver in the net package did not accept UDP responses larger than 512 bytes, so larger answers were truncated. Upgrading to Go 1.19 or later should fix this problem. From the Go 1.19 release notes:
The pure Go resolver will now use EDNS(0) to include a suggested maximum reply packet length, permitting reply packets to contain up to 1232 bytes (the previous maximum was 512).
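One way to check whether a given build is affected is to resolve the service name once with the pure Go resolver and count the addresses that come back. The following is a minimal sketch, not part of Cadence: `countPeers` and `lookupFunc` are hypothetical helpers introduced here, and the default host name is simply the one used in this issue.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"runtime"
	"time"
)

// lookupFunc abstracts the DNS lookup so the counting logic can be
// exercised without a network. Hypothetical helper type, not a Cadence API.
type lookupFunc func(host string) ([]string, error)

// countPeers resolves host once and returns the number of distinct
// addresses in the answer.
func countPeers(lookup lookupFunc, host string) (int, error) {
	addrs, err := lookup(host)
	if err != nil {
		return 0, err
	}
	seen := make(map[string]struct{}, len(addrs))
	for _, a := range addrs {
		seen[a] = struct{}{}
	}
	return len(seen), nil
}

func main() {
	// Host name taken from the issue; pass another name as the first argument.
	host := "cadence.service.consul"
	if len(os.Args) > 1 {
		host = os.Args[1]
	}

	// PreferGo asks for Go's built-in resolver, which is the one whose
	// EDNS(0) behavior changed in Go 1.19.
	resolver := &net.Resolver{PreferGo: true}
	lookup := func(h string) ([]string, error) {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		return resolver.LookupHost(ctx, h)
	}

	n, err := countPeers(lookup, host)
	if err != nil {
		fmt.Fprintln(os.Stderr, "lookup failed:", err)
		os.Exit(1)
	}
	fmt.Printf("%s resolved %d addresses for %s\n", runtime.Version(), n, host)
}
```

Building this with the same toolchain as the deployed binaries and comparing the count against the number of running frontend instances should show whether the resolver sees the full peer set.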
Additional context
Example of DNS truncation
> dig cadence.service.consul +noedns +short
10.0.0.216
10.0.0.145
10.0.0.79
> dig cadence.service.consul +short
10.0.0.145
10.0.0.216
10.0.0.233
10.0.0.79
10.0.0.17
10.0.0.185
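To illustrate why truncated answers show up as peer churn, here is a minimal sketch of diffing two consecutive lookup results. It is not the actual Cadence `dns_updater.go` logic; `diffPeers` is a hypothetical helper, and the "previous" subset is an assumed example of a different truncated answer to the same query.

```go
package main

import (
	"fmt"
	"sort"
)

// diffPeers compares the previous and current lookup results and returns
// the addresses that appeared and the ones that disappeared.
// Hypothetical helper for illustration only.
func diffPeers(previous, current []string) (added, removed []string) {
	prev := make(map[string]struct{}, len(previous))
	for _, a := range previous {
		prev[a] = struct{}{}
	}
	cur := make(map[string]struct{}, len(current))
	for _, a := range current {
		cur[a] = struct{}{}
		if _, ok := prev[a]; !ok {
			added = append(added, a)
		}
	}
	for _, a := range previous {
		if _, ok := cur[a]; !ok {
			removed = append(removed, a)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	// Assumed earlier truncated answer: a different 3-address subset.
	previous := []string{"10.0.0.216", "10.0.0.79", "10.0.0.17"}
	// Truncated answer from the +noedns query above.
	current := []string{"10.0.0.216", "10.0.0.145", "10.0.0.79"}

	added, removed := diffPeers(previous, current)
	fmt.Println("added:", added, "removed:", removed)
	// prints: added: [10.0.0.145] removed: [10.0.0.17]
}
```

Although all six instances stay healthy, every refresh that returns a different subset looks like one peer joining and another leaving.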
Logs details
{
  "service": "cadence-frontend",
  "message": "Add new peers by DNS lookup",
  "attributes": {
    "addresses": "[10.0.0.145:7833]",
    "address": "cadence.service.consul",
    "level": "info",
    "service": "cadence-frontend",
    "logging-call-at": "dns_updater.go:80"
  }
}
{
  "service": "cadence-frontend",
  "message": "Remove stale peers by DNS lookup",
  "attributes": {
    "addresses": "[10.0.0.17:7833]",
    "address": "cadence.service.consul",
    "level": "info",
    "service": "cadence-frontend",
    "logging-call-at": "dns_updater.go:80"
  }
}
Thanks!
🤦 I just realized the recent builds are on a newer version of Golang. Closing!