Lua library containing a DNS client, several utilities, and a load-balancer.
The module is currently OpenResty only, and builds on top of the `lua-resty-dns` library.
- resolves A, AAAA, CNAME and SRV records, including port
- parses `/etc/hosts`
- parses `/resolv.conf` and applies `LOCALDOMAIN` and `RES_OPTIONS` variables
- caches dns query results in memory
- synchronizes requests (a single request for many requestors, e.g. when cached ttl expires under heavy load)
- `toip` applies a local (weighted) round-robin scheme on the query results
- (weighted) round-robin balancer
- consistent-hashing balancer
- least-connections balancer
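
To make the "(weighted) round-robin" idea in the list above concrete, here is a minimal, self-contained sketch in plain Lua. It is not this library's implementation: the function name `new_wrr` and the record fields `address`, `port` and `weight` are assumptions chosen for illustration only. Each record is expanded into `weight` slots and the slots are visited in order.

```lua
-- Illustrative weighted round-robin; NOT the library's actual code.
-- `new_wrr` and the record layout are hypothetical names for this sketch.
local function new_wrr(records)
  -- Expand every record into `weight` slots (a weight of 0 yields no slots).
  local slots = {}
  for _, rec in ipairs(records) do
    for _ = 1, rec.weight do
      slots[#slots + 1] = rec
    end
  end

  local index = 0
  -- Returns the next address and port, or nil when no slot is available.
  return function()
    if #slots == 0 then
      return nil, "no peers available"
    end
    index = index % #slots + 1
    local rec = slots[index]
    return rec.address, rec.port
  end
end

local next_peer = new_wrr({
  { address = "10.0.0.1", port = 8000, weight = 2 },
  { address = "10.0.0.2", port = 8001, weight = 1 },
})

-- Count how often each address is selected over two full rounds.
local counts = {}
for _ = 1, 6 do
  local address = next_peer()
  counts[address] = (counts[address] or 0) + 1
end
print(counts["10.0.0.1"], counts["10.0.0.2"])
```

With weights 2 and 1, the first address is picked twice as often as the second. A slot table keeps each selection O(1) at the cost of memory proportional to the total weight.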
Copyright: (c) 2016-2021 Kong, Inc.
Author: Thijs Schreijer
License: Apache 2.0
Revision history:
- Feat: support running in the stream subsystem. PR 1
Tests are executed using `busted`, but because they run inside the `resty` cli tool, you must use the `rbusted` script.
For troubleshooting purposes: see the `/extra` folder for how to parse logs.
Versioning is strictly based on Semantic Versioning
Release process:
- update the changelog below
- update the rockspec file
- generate the docs using `ldoc .`
- commit and tag the release
- upload rock to LuaRocks
- Fix: `validTtl` should not be used for host-file entries. PR 134
- Performance: reduce amount of timers on init_worker. PR 130
- BREAKING: the round-robin balancing algorithm is now implemented in `resty.dns.round_robin` and has no consistent-hashing algorithm features, which are now all restricted to `resty.dns.consistent_hashing`. PR 123
- Added: option to enable or disable nameservers randomization. Note: this feature depends on the to-be-released feature in lua-resty-dns. PR 119
- Fix: potential synchronisation issue in the least-connections balancer. PR 126
- Fix: do not iterate over all the search domains when resolving an unambiguous fully-qualified domain name (FQDN), i.e. one ending in a dot. PR 122
- Fix: balancer DNS updates could go into a busy loop upon renewal. Reported as Kong issue #6739, fixed with PR 116.
- Fix: now a single timer is used to check for expired records instead of one per host, significantly reducing the number of resources required for DNS resolution. PR 112
- Dependency: Bump lua-resty-timer to 1.0
- Fix: workaround for LuaJIT/ARM bug, see Issue 93.
- Fix: table reduction was calculated incorrectly. Not a "functional" bug, just causing slightly less aggressive memory releasing.
- Added: alternative implementation of the consistent-hashing balancing algorithm, which does not rely on the addresses addition and removal order to build the same request distribution among different instances. See PR 97.
- BREAKING: `getPeer` now returns the host-header value instead of the hostname that was used to add the address. This is only breaking if a host was added through `addHost` with an ip-address. In that case `getPeer` will no longer return the ip-address as the hostname, but will now return `nil`. See PR 89.
- Added: option `useSRVname`, if truthy then `getPeer` will return the name as found in the SRV record, instead of the hostname as added to the balancer. See PR 89.
- Added: the callback now receives an extra parameter: the host-header for the address added/removed.
- Fix: using the module instance instead of the passed one for dns resolution in the balancer (only affected testing). See PR 88.
- Change: export DNS source type on status report. See PR 86.
- Fix: fix ttl-0 records issues with the balancer, see Kong issue Kong/kong#5477
  - the previous record was not properly detected as a ttl=0 record; by checking on the `__ttl0flag` we now do
  - since the "fake" SRV record wasn't updated with a new expiry time the expiry-check-timer would keep updating that record every second
- Fix: added logging of try-list to the TCP/UDP wrappers, see PR 75.
- Fix: reduce logging noise of the requery timer
- Fix: unhealthy balancers would not recover because they would not refresh the DNS records used. See PR 73.
- Added: automatic background resolving of hostnames, expiry will be checked every second, and if needed DNS (and balancer) will be updated. See PR 73.
- BREAKING: the balancer callback is called with a new event; "health" whenever the health status of the balancer changes.
- BREAKING: renamed `setPeerStatus` to `setAddressStatus` to be in line with the new `setHostStatus`, and prevent confusion.
- Added: keep track of unavailable weight. Added the `getStatus` method to return the health of the entire balancer structure. Health itself is determined based on the new property `healthThreshold`.
- Added: prevention of cascading failures when balancer is unhealthy. Use the `healthThreshold` value to set when the balancer is considered unhealthy.
- Added: method `setHostStatus`, to set the availability/health state of all addresses belonging to a host at once.
- Fix: when an asyncquery failed to create the timer, it would silently ignore the error. Error is now being logged.
- Fix: callback for adding an address did not pass the address object, but instead passed the balancer object twice.
- Fix: "balancer is nil" error, see issue #49.
- Refactor: split the balancer in a base class (handling DNS resolution) and the ring-balancer, implementing the algorithm.
- Added: new least-connections balancer
- Fix: since addresses could occasionally hold names instead of IP addresses, it could happen that a call to `setPeerStatus` was unsuccessful, because the IP address would not match the name in the `address` object. Now a `handle` is returned by `getPeer`.
- BREAKING: `getPeer` signature (and return values) changed, making this a breaking change.
- Added: a new option `validTtl` that, if set, will forcefully override the `ttl` value of any valid answer received. Issue 48.
- Fix: remove multiline log entries, now encoded as single-line json. Issue 52.
- Fix: always inject a `localhost` value, even if not in `/etc/hosts`. Issue 54.
- Fix: added a workaround for Amazon Route 53 nameservers replying with a `ttl=0` whilst the record has a non-0 ttl. Issue 56.
- Fix: the round-robin scheme for the balancer starts at a randomized position to prevent all workers from starting with the same peer.
- Fix: the balancer no longer returns `port = 0` for SRV records without a port, the default port is now returned.
- Fix: ipv6 nameservers with a scope in their address are not supported. This fix will simply skip them instead of throwing errors upon resolving. Fixes issue 43.
- Minor: improved logging in the balancer
- Minor: relax requery default interval for failed dns queries from 1 to 30 seconds.
- BREAKING: improved performance and memory footprint for large balancers. 80-85% less memory will be used, while creation time dropped by 85-90%. Since the `host:getPeer()` function signature changed, this is a breaking change.
- Change: BREAKING the errors for cache-only lookup failures and empty records have been changed.
- Fix: do not fail initialization without nameservers.
- Fix: properly recognize IPv6 in square brackets from the /etc/hosts file.
- Fix: do not set success-type to types we're not looking for. Fixes Kong issue #3210.
- Fix: store records from the additional section in cache
- Fix: do not overwrite stale data in the client cache with empty records
- Change: BREAKING all IPv6 addresses are now returned with square brackets
- Fix: properly recognize IPv6 addresses in square brackets
- Added: flag to mark an address as failed/unhealthy, see `setPeerStatus`
- Added: callback to receive balancer updates; addresses added-to/removed-from the balancer (after DNS updates for example).
- fix: SRV record entries with a weight 0 are now supported
- fix: failure of the last hostname to resolve (balancer)
- Fix: balancer not returning hostname for named SRV entries. See issue #17
- Fix: fix an occasionally failing test
- Refactor: remove metadata from the records, instead store it in its own cache
- Change: use a different randomizer for the ring-balancer to predictably recreate the balancer in the exact same state (adds the `lrandom` library as a new dependency)
- Added: resolution will be done async whenever possible. For this to work a new setting, `staleTtl`, has been introduced, which determines for how long stale records will be returned while a query is in progress in the background.
- Change: BREAKING! several functions that previously returned and took a resolver object no longer do so.
- Fix: no longer look up ip addresses as names if the query type is not A or AAAA
- Fix: normalize names to lowercase after query
- Fix: set last-success types for hosts-file entries and ip-addresses
- Removed: BREAKING! stdError function removed.
- Added: implemented the `search` and `ndots` options.
- Change: `resolve` no longer returns empty results or dns errors as a table but as lua errors (`nil + error`).
- Change: `toip()` and `resolve()` have an extra result; history. A table with the list of tried names/types/results.
- Fix: timeout and retrans options from `resolv.conf` were ignored by the `client` module.
- Fix: nameservers with an ipv6 address would not be used properly. Also added a flag `enable_ipv6` (default == `false`) to enable the usage of ipv6 nameservers.
- Fix: cname record caching causing excessive dns queries, see Kong issue #2303.
- Change: BREAKING! modified hash treatment, must now be an integer > 0
- Added: BREAKING! a retry counter to fall-through on hashed-retries (changes the `getpeer` signature)
- Fix: the MAXNS (3) was not honoured, so more than 3 nameservers would be parsed from the `resolv.conf` file. Fixes Kong issue #2290.
- Added: two convenience hash functions
- Performance: some improvements (pre-allocated tables for the slot lists)
- Fix: Cleanup disabled addresses but did not delete them, causing errors when they were repeatedly added/removed
- Fix: potential race condition when re-querying dns records
- Fix: potential memory leak when a balancer object was released with a running timer
- Kubernetes dns returns an SRV record for individual nodes, where the target is the same name again (hence causing a recursive loop). Now those entries will be removed, and if nothing is left, it will fail the SRV lookup, causing a fall-through to the next record type.
- Kubernetes tends to return a port of 0 if none is provided/set, hence the `toip()` function now ignores a `port=0` and falls back on the port passed in.
- breaking: renamed a lot of things; method names, module names, etc. pretty much breaks everything... also releasing under a new name
- feature: udp function `setpeername` added (client)
- fix: do not synchronize dns queries for ttl=0 requests (client)
- fix: full test coverage and accompanying fixes (ring-balancer)
- feature: auto-retry for failed dns queries (ring-balancer)
- feature: updating weights is now supported without removing/re-adding (ring-balancer)
- change: auto-retry interval configurable for failed dns queries (ring-balancer)
- change: max life-time interval configurable for ttl=0 dns records (ring-balancer)
- fix: `toip()` failed on SRV records with only 1 entry
- fix: was creating resolver objects even if serving from cache
- change: change resolver order (SRV is now first by default) for dns servers that create both SRV and A records for each entry
- feature: make resolver order configurable
- feature: ring-balancer (experimental, no full test coverage yet)
- other: more test coverage for the dns client