beyla is overloading the external dns server with frequent ipv6 reverse lookups
esara opened this issue · 5 comments
after upgrading from beyla v1.4 to beyla v1.5 or later, beyla generates a significantly larger number of DNS lookups (100k lookups instead of 100), and most of them are IPv6 reverse lookups
May 30 23:47:10 dnsmasq[1228]: query[PTR] f.9.0.f.e.c.7.0.9.0.a.7.0.3.c.8.8.1.c.7.e.2.2.1.5.8.b.3.d.7.b.a.ip6.arpa from 192.168.1.135
May 30 23:47:11 dnsmasq[1228]: query[A] f.9.0.f.e.c.7.0.9.0.a.7.0.3.c.8.8.1.c.7.e.2.2.1.5.8.b.3.d.7.b.a.ip6.arpa from 192.168.1.135
May 30 23:47:11 dnsmasq[1228]: query[AAAA] f.9.0.f.e.c.7.0.9.0.a.7.0.3.c.8.8.1.c.7.e.2.2.1.5.8.b.3.d.7.b.a.ip6.arpa from 192.168.1.135
May 30 23:47:11 dnsmasq[1228]: query[PTR] 9.7.8.e.4.e.f.6.5.e.d.2.2.1.c.d.c.8.f.5.b.9.e.c.2.9.9.4.4.3.8.0.ip6.arpa from 192.168.1.135
May 30 23:47:11 dnsmasq[1228]: query[A] 9.7.8.e.4.e.f.6.5.e.d.2.2.1.c.d.c.8.f.5.b.9.e.c.2.9.9.4.4.3.8.0.ip6.arpa from 192.168.1.135
May 30 23:47:11 dnsmasq[1228]: query[AAAA] 9.7.8.e.4.e.f.6.5.e.d.2.2.1.c.d.c.8.f.5.b.9.e.c.2.9.9.4.4.3.8.0.ip6.arpa from 192.168.1.135
https://github.com/grafana/beyla/blob/main/pkg/transform/name_resolver.go#L149 added as part of #745
We'll look into this, thank you for the report. In the meantime, you can disable the name resolver node with this line in your configuration file:
name_resolver: null
I managed to track this down a bit further in my local test environment. The lookups all come from go_nethttp and go_grpc connection parsing, which I see was recently changed in https://github.com/grafana/beyla/pull/725/files
there are multiple possible problems:
a) we are accidentally parsing an IPv6 address where there is only an IPv4 address, and we are also parsing all zeros or just a single 1 as an IPv6 address (of course, neither of them will resolve in a reverse lookup)
to exclude these, I am testing with a local workaround
b) even after this change, random strings are parsed as IPv6 addresses into the bpfConnectionInfoT struct
in the same branch, I am printing parts of the parsed trace to see what it contains
I see HTTP connections (this one is from JFrog)
HTTPRequestTraceToSpan ipv6 host: 7469666163746f72792f323032342d30:13361, peer: 656d2f6c6f67732f75736167652f6172:11575, type: 1, path: /artifactory/api/v1/system/logs/usage/artifactory/2024-07-14/21a1e9c6e710-metadata-consumption-usage, pathLen: 100, conn: tifactory/2024-0em/logs/usage/ar,status: 403, conLen: 1021
and gRPC connections (this is in our own software)
HTTPRequestTraceToSpan ipv6 host: 2f6170692f76312f7175657279000000:5008, peer: 007cb3dc5a06290100504f5354000000:0, type: 2, path: /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2, pathLen: 59, conn: /api/v1/query|Z)POST,status: 0, conLen: 0
so request data is ending up in the connection info (I am printing the connection struct as a string, which should not produce readable text)
Thanks for the input, @esara ! We will try to reproduce it locally and provide a fix.
thanks for #1019, it helped to solve most of the problems. I still see bogus IPv6 addresses for go_grpc server spans:
IPv4-mapped ipv6 hostname: 005f1ace40742500002a320d0a24340d:5008, peer: 80010000070000000a0448ca40742500:0, type: 2, method: , path: /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2,,status: 0
IPv4-mapped ipv6 hostname: 007b05d66374250000c1d23764742500:5008, peer: f000000008000000047b05d663742500:0, type: 2, method: , path: /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2,,status: 0
IPv4-mapped ipv6 hostname: 003cab2c76742500002a320d0a24340d:5008, peer: 80010000090000000a45fa2976742500:0, type: 2, method: , path: /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2,,status: 0
IPv4-mapped ipv6 hostname: afbe5beff85267fb1c9cf3e58ddddfac:5008, peer: 37f7556f2ff155125362541e95d9c18a:0, type: 2, method: , path: /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2,,status: 0
IPv4-mapped ipv6 hostname: 34c14c4a000000000000000000000000:5008, peer: ab878c72574fb2ea39c5e725fb5184f8:0, type: 2, method: , path: /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2,,status: 0
2024-07-17 02:27:57.71722757 (5.217959ms[5.196875ms]) GRPC_SRV 0 /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2 [5f:1ace:4074:2500:2a:320d:a24:340d as 5f:1ace:4074:2500:2a:320d:a24:340d:5008]->[8001:0:700:0:a04:48ca:4074:2500 as 8001:0:700:0:a04:48ca:4074:2500:0] size:0B svc=[causely/gateway go] traceparent=[00-893c5278a99d625b1dbe9bccd895d7e1-49da8b31ae333550-01]
2024-07-17 02:27:57.71722757 (5.962959ms[5.914834ms]) GRPC_SRV 0 /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2 [7b:5d6:6374:2500:c1:d237:6474:2500 as 7b:5d6:6374:2500:c1:d237:6474:2500:5008]->[f000:0:800:0:47b:5d6:6374:2500 as f000:0:800:0:47b:5d6:6374:2500:0] size:0B svc=[causely/gateway go] traceparent=[00-bc3d074db6137aca8dcd90be2928dfb8-69fb09a7d5cba749-01]
2024-07-17 02:27:57.71722757 (9.222667ms[9.112542ms]) GRPC_SRV 0 /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2 [3c:ab2c:7674:2500:2a:320d:a24:340d as 3c:ab2c:7674:2500:2a:320d:a24:340d:5008]->[8001:0:900:0:a45:fa29:7674:2500 as 8001:0:900:0:a45:fa29:7674:2500:0] size:0B svc=[causely/gateway go] traceparent=[00-d56926fa9f14ff6482568247e4be83c2-fc204f7414374776-01]
2024-07-17 02:27:58.71722758 (12.570041ms[12.518ms]) GRPC_SRV 0 /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2 [afbe:5bef:f852:67fb:1c9c:f3e5:8ddd:dfac as afbe:5bef:f852:67fb:1c9c:f3e5:8ddd:dfac:5008]->[37f7:556f:2ff1:5512:5362:541e:95d9:c18a as 37f7:556f:2ff1:5512:5362:541e:95d9:c18a:0] size:0B svc=[causely/gateway go] traceparent=[00-88ea2b39dede50b6cfd464e7c7c3a9e7-61f567f5e2008c0e-01]
2024-07-17 02:27:58.71722758 (5.089875ms[5.069375ms]) GRPC_SRV 0 /RemoteFetchEntityData/GetEntityDataMediationQueryRequests2 [34c1:4c4a:: as 34c1:4c4a:::5008]->[ab87:8c72:574f:b2ea:39c5:e725:fb51:84f8 as ab87:8c72:574f:b2ea:39c5:e725:fb51:84f8:0] size:0B svc=[causely/gateway go] traceparent=[00-8362d8ae23a76e320031bbd1c175cb7c-ef0636a6ba7647db-01]
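One detail in the output above: the values are labelled "IPv4-mapped", but an IPv4-mapped IPv6 address has the form ::ffff:a.b.c.d, and none of the logged values has that prefix. A minimal Go sketch using the standard `net/netip` package makes the check explicit. `describe` is a hypothetical helper for illustration, not Beyla code.

```go
package main

import (
	"fmt"
	"net/netip"
)

// describe classifies a textual address. Hypothetical helper for
// illustration only, not part of Beyla.
func describe(s string) string {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "invalid"
	}
	switch {
	case addr.Is4In6():
		// Unmap strips the ::ffff:0:0/96 prefix and yields the IPv4 address.
		return "IPv4-mapped: " + addr.Unmap().String()
	case addr.Is4():
		return "IPv4"
	default:
		return "IPv6"
	}
}

func main() {
	for _, s := range []string{
		"::ffff:192.168.1.135",               // a real IPv4-mapped address
		"5f:1ace:4074:2500:2a:320d:a24:340d", // host from the first GRPC_SRV line
		"34c1:4c4a::",                        // host from the last GRPC_SRV line
	} {
		fmt.Printf("%s => %s\n", s, describe(s))
	}
}
```

Both logged hosts are classified as plain IPv6, so a reverse lookup would still be attempted for them.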