aquasecurity/tracee

Network events cause high CPU and latency

yanivagman opened this issue · 0 comments

Description

When some of the network events are enabled (e.g. net_packet_dns_request and net_packet_dns_response) in environments with network-intensive workloads, the CPU usage of those workloads increases and their network throughput drops.
Collecting some statistics with bpftool on such an environment shows that our network programs are at the top (when sorted by run_cnt). More specifically, these programs: cgroup_bpf_run_filter_skb, trace_security_socket_sendmsg, trace_security_socket_recvmsg, cgroup_skb_egress, cgroup_skb_ingress.
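
For anyone who wants to reproduce the measurement, here is a minimal sketch of how the ranking could be produced. It assumes kernel BPF statistics have been enabled beforehand (`sysctl -w kernel.bpf_stats_enabled=1`), so that `bpftool prog show --json` includes the `run_cnt` and `run_time_ns` fields. The helper names `read_bpftool_stats` and `rank_by_run_cnt` are hypothetical and not part of tracee.

```python
import json
import subprocess


def read_bpftool_stats() -> str:
    """Return the JSON listing of loaded BPF programs (needs root and bpftool)."""
    result = subprocess.run(
        ["bpftool", "prog", "show", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def rank_by_run_cnt(bpftool_json: str, top: int = 10):
    """Return (name, run_cnt, avg_ns_per_run) tuples, busiest programs first."""
    rows = []
    for prog in json.loads(bpftool_json):
        # run_cnt / run_time_ns are absent when stats are disabled,
        # and zero for programs that never ran; skip both cases.
        run_cnt = prog.get("run_cnt", 0)
        if run_cnt == 0:
            continue
        avg_ns = prog.get("run_time_ns", 0) / run_cnt
        rows.append((prog.get("name", "<unnamed>"), run_cnt, avg_ns))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top]
```

Running `rank_by_run_cnt(read_bpftool_stats())` as root while the workload is active lists the most frequently invoked programs. The average time per run helps distinguish programs that are merely called often from those that are also expensive per call.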

We need to either optimize those programs or find a different approach to collect network events.

Affected events:

  • net_packet_dns
  • net_packet_dns_request
  • net_packet_dns_response
  • net_packet_ipv4
  • net_packet_ipv6
  • net_packet_tcp
  • net_packet_udp
  • net_packet_icmp
  • net_packet_icmpv6
  • net_packet_http
  • net_packet_http_request
  • net_packet_http_response
  • net_flow_tcp_begin
  • net_flow_tcp_end

Affected features:

  • pcap

Output of tracee version:

(paste your output here)

Output of uname -a:

(paste your output here)

Additional details