TechnikEmpire/WinDivertSharp

Lowering Latency

ryandotnet opened this issue · 3 comments

Hello! I'm back again with more attempted work with WinDivert and as usual I'm loving the C# binding / wrapper.

I was wondering if there are any parameters or practices that help lower latency to its absolute minimum? The use of WinDivert here is for latency-sensitive games. I've already tried overlapped I/O (RecvEx / SendEx) and multi-threading.

Right now, in a TCP ping test that sends one ping per second, modifying and re-injecting each packet only adds 0.1 ms of latency. However, this builds up as packets flow through like crazy in games and is extremely noticeable, causing network stutters / jitters in-game.
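As a rough illustration of how a fixed per-packet cost can be invisible in a one-per-second ping test yet add up under game traffic, here is a minimal sketch. It assumes a toy model in which packets are handled one at a time in FIFO order by a single worker; the function name `fifo_delays` and the burst size are hypothetical, and this is not a measurement of WinDivert itself.

```python
def fifo_delays(arrivals_us, service_us):
    """Return per-packet delay (departure minus arrival) in microseconds."""
    delays = []
    free_at = 0  # time at which the single worker next becomes idle
    for t in arrivals_us:
        start = max(t, free_at)        # wait if the worker is still busy
        free_at = start + service_us   # worker is occupied until this time
        delays.append(free_at - t)
    return delays


SERVICE_US = 100  # 0.1 ms per packet, the figure quoted above

# One packet per second, as in the ping test: no packet ever waits.
ping = fifo_delays([i * 1_000_000 for i in range(5)], SERVICE_US)

# Ten packets arriving at the same instant (assumed burst size).
burst = fifo_delays([0] * 10, SERVICE_US)

print(ping)
print(burst)
```

In this model every ping sees the same 0.1 ms, while each packet in the burst also waits for all the packets ahead of it, so the delay grows with queue position. Real traffic and a multi-threaded handler will behave differently; the sketch only shows why a small per-packet figure can still produce visible jitter.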

I currently have WinDivert redirect all game packets to a local proxy that then forwards them to the final destination. I have already ruled out the local proxy being a cause of extra latency.

I'm not sure if this problem is because WinDivert isn't able to redirect packets instantly as they arrive (similar to what TCP_NODELAY addresses, causing latency to build up), or because WinDivert is queueing the packets in some way. I feel like if it were the former, overlapped / async I/O should've fixed it, but it makes no noticeable impact.

If you need any examples of my code, or if it would be better for me to ask questions via email, Discord, etc., please let me know!

Thank you for your time ^^.

Hi there.

I've actually written my own driver so I've long abandoned WinDivert.

This issue is one reason why I've abandoned WinDivert (aside from it being flagged as malware by everyone, including the Microsoft driver signing portal).

The problem is the way WinDivert is written: it comes into the network stack like a wrecking ball and essentially forces the entire network stack to pop out of kernel space, into user space, then back into kernel space. It even totally destroys the mechanisms inside of Windows Filtering Platform that are used to make multiple drivers play nice. It just totally rips the packets out of kernel space by deep copying them, dropping them, and waiting for the user to scoop them up.

So unfortunately there isn't anything that can be done here. My driver is written in C++ and feeds to a local proxy and there is essentially 0 impact on network performance. Why? Because the driver leaves everything alone in the kernel space and actually respects the WFP subsystem of the OS.

I'm not trashing WinDivert; basil made these innards of the OS available to everyone via user space without needing their own EV certificate. That's great. It just comes at great cost, and your case is an example of that.

All the best.

I see.. that makes a lot of sense. Sorry if this re-opens the issue, I just have one more question.

Would you say that using WFP instead of WinDivert here will alleviate this problem, or would WFP perform similarly in this situation?

Thank you once more.

You have to write your own driver. WinDivert is built on top of WFP. Modern network drivers are built on top of it.

Unfortunately that's a rather extensive and complex process. But yes that's where you have to go to get maximum performance.