schweikert/fping

Performance Bottleneck

zywh opened this issue · 1 comments

zywh commented

I started using fping to ping/measure ~400K targets (IPs), but I can't push more than 8 Mbps (transit) per VM (ESXi).

Environment

ESXi with various Linux guest OSes. The server has 40 sockets and 80 logical CPUs, ~100G of memory, and a 10G NIC connected directly to the ISP core router. It's a pretty decent machine and setup. I wrote a Python wrapper to multi-thread fping and collect the results. Here are the initial results:

  1. Using 1 VM, I'm able to ping ~200K targets with "-c5 -i4 -b12 -t500 -r1" in around 60 seconds.

Threads (python) = 72
ping bucket (per fping) = 500

CPU (8x vCPU) is fairly busy but OK. I tried more vCPUs, up to 32, and it makes no difference.
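For readers who want to reproduce the setup, the bucketed, multi-threaded invocation described above could look roughly like the sketch below. It is not the code from the wrapper linked later in the thread. The function names (`chunk`, `build_cmd`, `run_bucket`, `run_all`) are made up for illustration, and it assumes `fping` is on `PATH` and writes its per-target summary to stderr when run with `-c`.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Flags and sizes quoted in the post; everything else here is illustrative.
FPING_ARGS = ["-c5", "-i4", "-b12", "-t500", "-r1"]
BUCKET_SIZE = 500   # targets handed to each fping process
THREADS = 72        # concurrent fping processes


def chunk(targets, size=BUCKET_SIZE):
    """Split the target list into consecutive buckets of at most `size`."""
    return [targets[i:i + size] for i in range(0, len(targets), size)]


def build_cmd(bucket):
    """Build the fping command line for one bucket of targets."""
    return ["fping", *FPING_ARGS, *bucket]


def run_bucket(bucket):
    """Run one fping process and return its stderr text.

    Assumption: the per-target summary lines are printed on stderr.
    """
    proc = subprocess.run(build_cmd(bucket), capture_output=True, text=True)
    return proc.stderr


def run_all(targets):
    """Run every bucket through a fixed-size thread pool."""
    with ThreadPoolExecutor(max_workers=THREADS) as pool:
        return list(pool.map(run_bucket, chunk(targets)))
```

Each thread only waits on a child process, so the Python side should add little CPU load of its own. Calling `run_all(list_of_ips)` returns one stderr string per bucket, which still has to be parsed.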

  1. With the same setup, if I double the threads or increase the number of targets per fping, it runs "faster", but I notice RTT increases and packet loss becomes unreliable.

  2. Ubuntu 21 and CentOS 8 have similar performance.

  3. Alpine Linux is much slower with the same setup.

  4. If the fping interval (-i) is changed to 2 ms, it's faster, but the results (RTT/packet loss) become unreliable.

  5. If the fping interval (-i) is changed to 10 ms, each round (200K) slows down to ~90 seconds.

  6. I looked around the Linux kernel and adjusted a few ICMP/socket parameters. No luck.

  7. I tried multiple processes as well. The result is similar to multi-threading.

  8. I tried a few native Python ICMP packages, and none of them is more reliable or faster than fping.
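Since several of the observations above depend on `-i`, a rough pacing model may help frame them. fping's `-i` is the minimum gap between packets sent by one process, so each process can send at most `1000 / i` packets per second. The sketch below assumes that pacing is the only limiter and ignores the final `-t` wait. The function names are hypothetical.

```python
import math


def max_pps(workers, interval_ms):
    """Upper bound on packets/s if every worker is paced only by -i."""
    return workers * 1000 / interval_ms


def min_round_seconds(targets, bucket, workers, count, interval_ms):
    """Lower bound on the time for one full round under the same assumption."""
    buckets = math.ceil(targets / bucket)      # fping processes needed
    waves = math.ceil(buckets / workers)       # sequential batches of workers
    per_bucket = bucket * count * interval_ms / 1000  # seconds per process
    return waves * per_bucket


print(max_pps(72, 4))                               # 18000.0
print(min_round_seconds(200_000, 500, 72, 5, 4))    # 60.0
print(min_round_seconds(200_000, 500, 72, 5, 10))   # 150.0
```

With `-i4` the model gives 60 s, which matches the reported round time. With `-i10` it gives 150 s against the ~90 s reported, so it is a rough model at best and not a confirmed explanation.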

Basically I can do 200K targets in 60s per VM: PPS = 200,000 * 5 / 60 ≈ 16.7K PPS. Network traffic peaks at ~8 Mbps with stable results. With 3 VMs, I can do ~600K targets. It's quite amazing, but I'd like to understand whether I can push more per VM.
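As a quick sanity check on those figures, the sketch below works out the packet rate for 200K targets at 5 pings each in 60 s, and the packet size that ~8 Mbps would imply at that rate. It assumes the 8 Mbps figure is measured in one direction, which the post does not state.

```python
targets = 200_000      # targets per round, per VM
count = 5              # pings per target (-c5)
round_seconds = 60     # observed round time
link_mbps = 8          # observed peak traffic (assumed one direction)

# Echo requests sent per second.
pps = targets * count / round_seconds

# Bytes per packet implied by the observed bandwidth at that rate.
bytes_per_packet = link_mbps * 1_000_000 / 8 / pps

print(round(pps), round(bytes_per_packet))  # 16667 60
```

About 60 bytes per packet is plausible for `-b12`: 12 bytes of ping data plus 8 bytes of ICMP header and 20 bytes of IPv4 header is 40 bytes, which Ethernet pads up to its minimum frame size.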

Does anyone have a clue where the bottleneck is? The Linux kernel's ICMP socket handling?

zywh commented

The Python wrapper is here for reference:

https://github.com/zywh/pyfping