sc_sock thread-safety
svladykin opened this issue · 11 comments
I'm going to have a multi-threaded system with multiple poll objects and poller threads. I want to be able to add/del sockets to a poll object from multiple threads (e.g. server accepts a non-blocking socket in one thread and dispatches it to a random poller thread, or a worker thread does non-blocking socket connect and dispatches the socket to a random poller thread as well).
As far as I can see epoll
and kqueue
already support this out of the box. Still sc_sock_poll
has some additional unprotected shared state:
- Resizeable
sc_sock_poll.events
(andcount
+cap
). This is fixable by removing resizing and havingevents
buffer of a constant size (like 4096 or so). - Shared
sc_sock_poll.err
. Looks like the right way to handle it is to fillsc_sock.err
instead ofsc_sock_poll.err
on add/del failures, but this is a backward compatibility breaking change (in a minor way though, it breaks only error handling path and not critically: the outdated code will see empty error message insc_sock_poll_err()
after failed add/del operations, which does not look too bad).
Windows as usual would require many more situps but still must be doable :)
@tezc What do you think about it? I can implement this, are you OK with reviewing/merging such a change?
@svladykin Sorry, I'm a bit late (I was mostly afk this weekend)
Why don't you have an event queue between threads?(maybe a pipe). Is it because you are trying to avoid some syscalls? Normally, you can have a pipe between threads and send events through it. I guess it'll cause 1 pipe write + 1 pipe read + 1 more (say epoll_ctl) vs 1 direct epoll_ctl?
Regarding your points:
1- That's fine. As we discussed previously, this is necessary if you are going to have thousands of sockets anyway.
2- This also sounds good enough. Btw, I don't maintain %100 API compatibility between versions (just to have some flexibility). So, small changes are always okay if it is going to make code better.
As far as I can see, Windows version would require some sort of mutex. For epoll
and kqueue
, we don't need anything but these two fixes above?
At the end, we'll just make sc_sock_poll_add()
and sc_sock_poll_del()
thread-safe.
Polling itself, sc_sock_poll_wait()
and other functions will not be thread safe. (No need to do that I guess).
Please go ahead, maybe it'd be better to discuss over a PR :)
Yes, my point is that if epoll
and kqueue
already thread-safe and support this scenario, then from performance and simplicity point of view I should not reinvent the wheel. Windows is of course on the opposite side of things here, it will become more complicated, but in my view it is better to write complex code once for a single platform within the library than to have more complex cross-platform machinery at the application level.
Yes, mutex is only needed for Windows. Also, if I understand correctly I can add sc_sock_pipe
to sc_sock_poll
and receive READ events for it? Though it seems like it is not possible to make that pipe non-blocking right now, but I guess it should not be important. Btw, p->fdt.fd = p->fds[0]
line seems to be missing for Windows.
My idea was to have a pipe protected by a mutex, this pipe must be registered to a poll object and when threads send new sockets to that pipe poller thread calling sc_sock_poll_wait()
will wake up, receive and add/del these sockets. Also, if we know that sc_sock_poll_wait()
is not being called right now (probably because we are actually in the polling thread) we can skip the pipe and add/del sockets directly.
I will try to come up with the prototype soon.
1-
Btw, p->fdt.fd = p->fds[0] line seems to be missing for Windows.
Oops, shame.
Personally, I don't use sc_sock on Windows. Linux is the primary OS. (I don't even use it actively these days). So, there might be some bugs on other OSes or with other setups. For example, FreeBSD CI takes a long time, probably there is something about DNS resolution.
Also, I remember we had some changes in sc_sock but couldn't push them here due to licensing issues :(
2-
Also, if I understand correctly I can add sc_sock_pipe to sc_sock_poll and receive READ events for it?
Definetely.
You can take a look at my toy project (Linux only) (I don't update this one anymore but this is the only public repo I can share):
https://github.com/tezc/resql/blob/b7eb428d1b412637a7ad1e4148d704da9461b815/src/server.c#L163
Here, I create pipes and add them to the poll.
3-
Though it seems like it is not possible to make that pipe non-blocking right now, but I guess it should not be important
Why do you need a non-blocking pipe? You often just write a struct to send a task/message and the message is usually very small. It will be written atomically.
Here, how I used:
https://github.com/tezc/resql/blob/b7eb428d1b412637a7ad1e4148d704da9461b815/src/snapshot.c#L302
4-
My idea was to have a pipe protected by a mutex
Maybe I'm missing something, why do you need a mutex? You can just write events to the pipe atomically.
I see sc_sock_pipe error path is not thread-safe on Windows but I guess we can change it and set errno on error.
Thanks for the pointers!
I guess you are right and in practice send
will be atomic for small values, this is not explicitly guaranteed though.
If we assume that send
is atomic and we always add/del sockets using the pipe, then mutex should not be needed. And that is fine for my use case, but I was also thinking about a single threaded case when sockets are never added from other threads, in my understanding taking/releasing mutex under no contention must be faster than using pipe. Maybe it does not worth the complexity, I'm not sure.
As for non-blocking pipes, I was thinking about some tricky failure scenarios when a poller thread stucks on processing and stops receiving from the pipe and the thread that tries to add a socket will hang on pipe send
in blocking mode, while in non-blocking mode I can decide how to handle this situation (try to register the socket with another poller or fail the operation right away). Btw, it would be interesting to test if such a failure scenario can break pipe send atomicity.
POSIX guarantees atomicity for pipes if you write less than PIPE_BUF
but I forgot Windows, ofcourse :) So, there is no pipe on Windows, it is just a regular socket connection, I guess it won't be atomic when it's called from multiple threads.
For Windows there is a check that sets error on pipe read (and the same for pipe write):
rc = recv(p->fds[0], (char *) data, len, 0);
if (rc == SOCKET_ERROR || (unsigned int) rc != len) {
sc_sock_pipe_set_err(p, "pipe recv() : err(%d) ", WSAGetLastError());
What is the point of checking for rc != len
when it is perfectly valid behaviour and WSAGetLastError()
will return something unrelated?
As the pipe/socket is blocking currently, it should not read/write partially I guess. Probably, hoping here it would give something meaningful on a partial read/write. In the app, I always have a check for partial read/write and abort with the error message of the socket.
Btw, I see Windows added _pipe()
: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/pipe?view=msvc-170
Maybe I missed this API before, maybe it's new, not sure. Also, not sure about atomic operations as on Unixes but it might be useful for us?
I don't see anything in Windows docs saying that _pipe()
does atomic reads/writes as in POSIX.
Moreover, I actually need partial reads with my current approach: I fill a growing buffer of operations to be done on sockets by multiple threads (the buffer is protected by CriticalSection) and only the first thread has to actually write a signal byte to a pipe to wakeup a poller thread. In the poller thread I have to drain that pipe, but this happens asynchronously, so I will not know how many bytes and when I will have to drain (I don't want to do a blocking read
call for every signal byte on every poll operation), so the plan is to just go with a reasonably sized small buffer (like 8 bytes) and it should be fine as far as we have READ events and partial reads for that pipe. You already can look into this in #99 in sc_sock_poll_signal()
and sc_sock_poll_wait()
.
My point was that rc != len
check does not seem to do anything useful inside the library if the application has to double check it anyways. Also, it sets the error message to something completely misleading. I would just remove this check and assume that on Windows pipe allows for partial reads and it is responsibility of an application to deal with it.
Sure, we can delete that check.