Use SocketAsyncEventArgs
Using `SocketAsyncEventArgs` could greatly improve performance compared to the Begin/End Socket methods.
Benefits:
- Non-blocking.
- Reduce strain on the GC by pre-allocating `SocketAsyncEventArgs` objects (it also avoids allocating an `IAsyncResult` object on each call); see the sketch below.
- Can prevent memory fragmentation from occurring.
- `SocketAsyncEventArgs` does not block when the send buffer is full (unlike Begin/End).
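For illustration, something like this (a hypothetical class, not TSAPI code) is roughly what pre-allocating the `SocketAsyncEventArgs` objects against one fixed buffer could look like, which is where the GC and fragmentation wins come from:

```csharp
using System;
using System.Collections.Concurrent;
using System.Net.Sockets;

// Hypothetical sketch: every SocketAsyncEventArgs is created up front and points at a
// fixed slice of one large buffer, so steady-state sends allocate nothing and the
// buffer never moves.
public sealed class SendArgsPool
{
    private const int SliceSize = 8192;
    private readonly ConcurrentBag<SocketAsyncEventArgs> _pool = new ConcurrentBag<SocketAsyncEventArgs>();
    private readonly byte[] _buffer;

    public SendArgsPool(int count)
    {
        _buffer = new byte[count * SliceSize];
        for (int i = 0; i < count; i++)
        {
            var args = new SocketAsyncEventArgs();
            args.SetBuffer(_buffer, i * SliceSize, SliceSize); // each args owns a fixed slice
            _pool.Add(args);
        }
    }

    public SocketAsyncEventArgs Rent()
    {
        return _pool.TryTake(out var args)
            ? args
            : throw new InvalidOperationException("pool exhausted");
    }

    public void Return(SocketAsyncEventArgs args) => _pool.Add(args);
}
```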
Yeah, I looked at that. Although it'll help, it still uses the Begin/End Socket methods plus a lot of locking.
@tylerjwatson can probably elaborate on his plans, since he's taken the lead on overhauling the networking. I think the locking is pretty negligible compared to the performance increase gained from the current send queue. But then again, I don't really know any of the hard data and am working off what I've been told in Slack.
I'll leave it to Tyler to drop in when he wakes up.
Hi @Mitch528
First of all, thank you for your suggestion; it's probably one of the best and most concise ones we've gotten. Indeed, you are correct about `SocketAsyncEventArgs`. The problem arises because TSAPI generates so much data that the send buffers end up completely full and `EndWrite` will block until it has pumped all the data back to the client.
I am overhauling the packet sending mechanism to use a queue, which you have already seen. The reason it still uses the `BeginWrite` and `EndWrite` methods is that, whilst testing the buffer and getting it stable, I wanted to keep the sending mechanism as close to the real thing™ as possible so I could isolate any funny business whilst it was being designed.
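For context, this is roughly the shape of the Begin/End path I'm describing (an illustrative helper, not the actual sendq code):

```csharp
using System.Net.Sockets;

// Illustrative only: the Begin/End pattern the current sendq still goes through.
// The callback's EndWrite is where a send can stall once the client's send buffer
// is already full, which is the blocking behaviour described above.
static class BeginEndSender
{
    public static void Send(NetworkStream stream, byte[] buffer)
    {
        stream.BeginWrite(buffer, 0, buffer.Length, ar =>
        {
            var s = (NetworkStream)ar.AsyncState;
            s.EndWrite(ar); // completes (or waits) here instead of on the game loop
        }, stream);
    }
}
```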
You are incorrect about your locking assumption, though. The locks protect the segment lists so that two threads trying to claim segments at the same time can't end up with overlapping regions in the buffer; I've never seen the segment list go above 10 entries, so locks are only really held long enough to emplace a segment into the segment list. Lock contention is about 3% during my stress test (I actually measured it).
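To illustrate what I mean, the locking is roughly this shape (names here are made up for the sketch, not the real sendq code):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the locking described above: the lock is held only long
// enough to emplace a segment into the segment list, so two threads can never
// reserve overlapping regions of the shared buffer.
public sealed class SegmentList
{
    private readonly object _lock = new object();
    private readonly List<ArraySegment<byte>> _segments = new List<ArraySegment<byte>>();
    private readonly byte[] _buffer;
    private int _head;

    public SegmentList(int capacity)
    {
        _buffer = new byte[capacity];
    }

    public ArraySegment<byte> Emplace(int count)
    {
        lock (_lock)
        {
            if (_head + count > _buffer.Length)
                throw new InvalidOperationException("segment buffer full");

            var segment = new ArraySegment<byte>(_buffer, _head, count);
            _head += count;
            _segments.Add(segment);
            return segment; // caller writes its packet into the segment and sends outside the lock
        }
    }
}
```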
I chose this design because there are small nuances in the way Socket send mechanisms work between Mono and Windows. At least on Windows, the async socket mechanisms use I/O completion ports and an unmanaged P/Invoke backend under the hood, but I can't say the same for Mono, which appears to just use `IAsyncResult`s as a wrapper for sending information down the managed socket [1].
Using a queue in either case, and making sure that packets are immediately enlisted, will guarantee me predictable performance regardless of the I/O implementation. My goals for the send queue are (a rough sketch follows the list):
- Remove allocations from calls to `new MemoryStream()` by having memory streams point to a locked segment instead of copying buffers (there are some cases where this isn't possible)
- Buffers are never deallocated, mutated or copied in any way
- Send work is shared across one spawned network I/O thread per connected client
- The game loop is unaffected by pumping data to clients
- Make Gen2 happy by not requiring it to pin buffers for the duration of an async send, as the buffer never really moves anyway
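To make that concrete, here's a rough sketch of the shape I'm aiming for (illustrative names only, not the actual implementation):

```csharp
using System;
using System.Collections.Concurrent;
using System.Net.Sockets;
using System.Threading;

// Rough sketch only: one I/O thread per client drains a queue of segments that
// point into a fixed buffer, so nothing is copied, reallocated, or pinned per send,
// and the game loop only ever enqueues.
public sealed class ClientSendQueue
{
    private readonly BlockingCollection<ArraySegment<byte>> _queue = new BlockingCollection<ArraySegment<byte>>();
    private readonly Socket _socket;

    public ClientSendQueue(Socket socket)
    {
        _socket = socket;
        new Thread(SendLoop) { IsBackground = true, Name = "net send" }.Start();
    }

    // Called from the game loop: just enqueue, never block on the socket.
    public void Enqueue(ArraySegment<byte> segment) => _queue.Add(segment);

    private void SendLoop()
    {
        foreach (var segment in _queue.GetConsumingEnumerable())
        {
            // A plain send on the dedicated thread keeps ordering simple;
            // swapping this for a reused SocketAsyncEventArgs is the follow-up step.
            _socket.Send(segment.Array, segment.Offset, segment.Count, SocketFlags.None);
        }
    }
}
```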
If you've gotten to this point without getting bored, I salute you. I should mention that I have already had massive success with it, even after only half implementing it.
[1] https://github.com/mono/mono/blob/master/mcs/class/System/System.Net.Sockets/SocketAsyncEventArgs.cs
As a footnote, I am going to leave this open until the sendq has been implemented and the send thread uses `SocketAsyncEventArgs`, as it's a valid and wise recommendation.
No problem.
Ah, I see. I misunderstood the locking portion of your code.
Yeah, looking at Mono's code, it does seem to use IAsyncResult. However, from what I can tell, it reuses it instead of creating a new one with each call (https://github.com/mono/mono/blob/master/mcs/class/System/System.Net.Sockets/SocketAsyncWorker.cs#L33).
OTAPI