Use of dpdk
Hello,
Sorry to bother you about this, but I was wondering: why is DPDK used (as opposed to raw sockets)? The paper says "We utilize DPDK 20.08 [18] for optimized network performance and communication between the database nodes." Is there a substantial improvement from using DPDK (for example, does it change the performance profile or change what the bottleneck is), or is it "just" a faster networking stack that improves performance by, say, some constant factor? (The reason I ask is that I am trying to decide whether I need to set up DPDK, or whether I could use your UDPCommunicator, replaced with raw sockets, in my experiments.)
Thank you!
Hi,
DPDK is a zero-copy, kernel-bypass framework built around poll-mode network drivers, and performance-wise it is in many ways better than raw sockets. Polling for packets instead of relying on OS interrupts reduces latency and overhead. A single core can easily saturate a 10 Gbps network interface, and there is no way to achieve this with the Linux kernel stack. I also implemented ways to avoid any additional memory allocation, e.g. the response is written into the same mbuf as the requesting packet. Also, with DPDK I did not experience any packet drops at these high line rates.
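To make the two ideas above concrete (polling instead of waiting for the OS to wake you, and writing the response into the buffer the request arrived in), here is a minimal sketch using plain kernel UDP sockets in Python. It is not DPDK and not the code from this repository: it still goes through the kernel stack and copies data, so it only illustrates the control flow. The names `make_polling_socket` and `poll_once`, and the one-byte "R" reply marker, are hypothetical and chosen just for this example.

```python
import socket


def make_polling_socket(host="127.0.0.1", port=0):
    """Create a non-blocking UDP socket (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.setblocking(False)  # recv calls return immediately instead of sleeping
    return sock


def poll_once(sock, buf):
    """Poll for one datagram and answer it from the same buffer.

    Returns the number of bytes handled, or 0 if nothing was waiting.
    `buf` is a preallocated bytearray that is reused for every packet,
    so the hot path does not allocate a new receive or send buffer.
    """
    view = memoryview(buf)
    try:
        nbytes, addr = sock.recvfrom_into(view)
    except BlockingIOError:
        return 0  # nothing to read right now; the caller keeps spinning
    if nbytes > 0:
        buf[0] = ord("R")  # hypothetical in-place edit: mark packet as a response
    sock.sendto(view[:nbytes], addr)  # reply is a slice of the request buffer
    return nbytes
```

A caller would run `poll_once` in a tight loop on a dedicated core, which is the same trade DPDK makes: one core is kept busy in exchange for not paying interrupt and wake-up latency.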
The UDPCommunicator was a way for me to test the database and ease its development; I did not intend it for benchmark scenarios. For measurements I recommend that you use DPDK. The setup might sound a bit complicated, but it is basically a script that switches the OS driver of the NIC to the poll-mode driver. Then pin the application to the CPU socket where the NIC is attached to avoid any interconnect traffic, set up hugepages, and start the application. BTW, the interface becomes unusable by other applications that do not build on DPDK because of the different driver.
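Two of the setup steps above (hugepages and pinning to the NIC's socket) can be checked before launching anything. The following is a rough sketch, assuming a Linux host that exposes `/proc/meminfo` and `/sys/class/net/<iface>/device/numa_node`; the helper names `parse_hugepages` and `nic_numa_node` are hypothetical and not part of DPDK or this repository. The NUMA lookup is best done before rebinding the NIC, since with most drivers the interface no longer shows up under `/sys/class/net` afterwards.

```python
from pathlib import Path


def parse_hugepages(meminfo_text):
    """Extract the HugePages_* counters from /proc/meminfo-style text."""
    counters = {}
    for line in meminfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("HugePages_"):
            counters[key] = int(value.split()[0])
    return counters


def nic_numa_node(iface, sysfs_root="/sys/class/net"):
    """Return the NUMA node the NIC is attached to, or None if unknown.

    The kernel reports -1 when it has no NUMA information for the device
    (typical on single-socket machines).
    """
    path = Path(sysfs_root) / iface / "device" / "numa_node"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None
```

For example, `parse_hugepages(Path("/proc/meminfo").read_text())` shows whether any hugepages are reserved and free, and the value from `nic_numa_node("eth0")` (interface name is a placeholder) tells you which socket's cores to pin the application to.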
TLDR: DPDK > Linux sockets
Oh ok, thank you for this information!