This project can handle DNS caching just for the Docker containers or for the whole instance. The caching application is CoreDNS with a very simple Corefile.
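As a rough illustration of the approach (not the exact file shipped in the image), a Corefile wired up this way looks something like the sketch below, using the default values from the variables table further down:

```
. {
    errors
    # Forward anything not answered from the cache to the upstream resolver (DNS_UPSTREAM)
    forward . 172.31.0.2
    # Cache responses for DNS_CACHE_TIME seconds, with capacities and prefetch
    # taken from DNS_CACHE_SUCCESS, DNS_CACHE_DENIAL and DNS_CACHE_PREFETCH
    cache 120 {
        success 5000
        denial 2500
        prefetch 10
    }
    # Expose Prometheus metrics and the health endpoint (DNS_PROMETHEUS_PORT, DNS_HEALTH_PORT)
    prometheus :9153
    health :8080
}
```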
| AMI | Tested | Status |
|---|---|---|
| ECS Amazon Linux | Yes | Works |
| ECS Amazon Linux 2 | Yes | Works |
| Variable | Description | Example / Default |
|---|---|---|
| DNS_UPSTREAM | The upstream DNS server to forward to when the request is not in the cache. | 172.31.0.2 |
| DNS_PROMETHEUS_PORT | The port you want the metrics exposed on, e.g. `[PORT]/metrics`. | 9153 |
| DNS_HEALTH_PORT | The port you want the health check exposed on, e.g. `[PORT]/health`. When CoreDNS is up and running this returns a 200 OK HTTP status code. | 8080 |
| DNS_CACHE_TIME | TTL in seconds for the DNS cache. | 120 |
| DNS_CACHE_PREFETCH | Prefetch will fetch popular items again before they are expunged from the cache. | 10 |
| DNS_CACHE_SUCCESS | The maximum number of success packets CoreDNS caches before it starts evicting (randomly). | 5000 |
| DNS_CACHE_DENIAL | The maximum number of denial packets CoreDNS caches before it starts evicting (LRU). | 2500 |
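To make the mapping concrete, here is a sketch of how these variables would be passed to the container when running it by hand; the image name `ecs-local-dns-cache` is a placeholder, and in practice the values come from the task definition:

```sh
# Placeholder image name; the real values normally come from the task definition.
docker run -d \
  -e DNS_UPSTREAM=172.31.0.2 \
  -e DNS_PROMETHEUS_PORT=9153 \
  -e DNS_HEALTH_PORT=8080 \
  -e DNS_CACHE_TIME=120 \
  -e DNS_CACHE_PREFETCH=10 \
  -e DNS_CACHE_SUCCESS=5000 \
  -e DNS_CACHE_DENIAL=2500 \
  ecs-local-dns-cache
```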
| Variable | Description | Example / Default |
|---|---|---|
| INTERVAL | How long between each health check, in seconds. | 5 |
| RETRIES | Maximum number of failures before the health check fails. | 5 |
| START_PERIOD | How long to wait before the health check starts, in seconds. | 5 |
| TIMEOUT | How long the health check should wait before timing out, in seconds. | 5 |
| HEALTHCHECK_PORT | The port the health check queries. This should match DNS_HEALTH_PORT. | 8080 |
| APPLICATION_NAME | Name of the application to display in the log. | CoreDNS |
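In effect, the health check boils down to polling the CoreDNS health endpoint and expecting an HTTP 200; you can do the same by hand on the instance (assuming the default port of 8080):

```sh
# Poll the health endpoint; CoreDNS returns 200 OK once it is up.
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/health
# 200
```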
Please make sure to replace the placeholder upstream DNS server IP, 172.31.0.2, with your own.
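If you are unsure what your upstream should be: the Amazon-provided VPC resolver sits at the base of the VPC CIDR plus two (for example, a 172.31.0.0/16 VPC resolves at 172.31.0.2). One way to check the CIDR, with a placeholder VPC ID:

```sh
# vpc-0123456789abcdef0 is a placeholder; substitute your own VPC ID.
aws ec2 describe-vpcs \
  --vpc-ids vpc-0123456789abcdef0 \
  --query 'Vpcs[0].CidrBlock' \
  --output text
```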
- Run `ecs-local-dns-cache-bridge.taskdef.json` as a Daemon service.

See `example-usage-bridge.taskdef.json` for a real-world example of the following.
- Set the `"dnsServers"` parameter for your containers in your task definition(s) to `169.254.20.10 172.31.0.2`, as sketched below.
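For illustration, the relevant fragment of a container definition would look roughly like this (container name and image are placeholders; see `example-usage-bridge.taskdef.json` for the real thing):

```json
{
  "name": "my-app",
  "image": "my-app:latest",
  "essential": true,
  "dnsServers": ["169.254.20.10", "172.31.0.2"]
}
```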
Work in progress.
Verify it is working on the EC2 instance.
```sh
$ for run in {1..2}; do sleep 1; docker run -it \
    --dns 169.254.20.10 \
    --dns 172.31.0.2 \
    busybox nslookup -type=a -debug ecs.aws; \
  done
```
- Work in progress.
You should now see a cache hit of type `success`:
```
$ curl -s localhost:9153/metrics | grep 'coredns_cache_hits_total{server="dns://:53",type="success"}'
coredns_cache_hits_total{server="dns://:53",type="success"} 1
```
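To see how many lookups fell through to the upstream instead, the standard CoreDNS cache plugin also exposes a miss counter (`coredns_cache_misses_total`) on the same endpoint:

```sh
$ curl -s localhost:9153/metrics | grep coredns_cache_misses_total
```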
CoreDNS provides many metrics for understanding cache hits / misses and whether CoreDNS may be lagging behind. Take a look at METRICS.md for a list of entries.
Early test numbers, before any tuning has been performed. This test was also run on the same EC2 instance (t3a.small).
```
$ resperf -s 169.254.20.10 -d queryfile-example-10million-201202
DNS Resolution Performance Testing Tool
Version 2.2.1

[Status] Command line: resperf -s 169.254.20.10 -d queryfile-example-10million-201202
[Status] Sending
[Status] Reached 65536 outstanding queries
[Status] Waiting for more responses
[Status] Testing complete

Statistics:

  Queries sent:        80258
  Queries completed:   23533
  Queries lost:        56725

  Response codes:      NOERROR 14003 (59.50%), SERVFAIL 4855 (20.63%), NXDOMAIN 4675 (19.87%)
  Run time (s):        54.813746
  Maximum throughput:  4200.000000 qps
  Lost at that point:  69.45%
```
Container resource usage on the instance during the test (`docker stats`):

```
CONTAINER ID   NAME                                                                     CPU %    MEM USAGE / LIMIT     MEM %   NET I/O   BLOCK I/O         PIDS
67489f818fb2   ecs-ecs-local-dns-cache-24-dns-cache-healthcheck-c0e2bd86dcb7a0cede01    0.09%    1.07MiB / 1.916GiB    0.05%   0B / 0B   229kB / 0B        2
40875d6040a1   ecs-ecs-local-dns-cache-24-dns-cache-d896beb292dff8f9b901                24.64%   15.47MiB / 1.916GiB   0.79%   0B / 0B   28.8MB / 0B       11
d15d11835b24   ecs-agent                                                                0.17%    12.87MiB / 1.916GiB   0.66%   0B / 0B   46.3MB / 13.1MB   13
```
- Q: Does this work for ECS Bridge Mode?
  - A: Bridge mode is all this has been tested on at the moment. The awsvpc network mode is currently being worked on.
- Q: Will this work for Fargate?
  - A: Not tested, but until the DNS option is supported in the awsvpc network mode, a different method is needed.