getsockopt with SOL_SOCKET, SO_ERROR returns -26
When connecting a non-blocking socket and using poll() to wait for it to become writable, a subsequent getsockopt() call with SOL_SOCKET, SO_ERROR returns 0 but reports the value -26 as the socket error. Looking through the translation table, that should be EINPROGRESS. This is doubly unexpected: the value should not be negative (to make matters worse, strerror() even seems to crash on it), and it should no longer be EINPROGRESS once the socket is connected. My guess is that the socket's last error is returned and that establishing the connection does not reset it to 0.
Looking at the code, I think getsockopt needs special casing depending on the passed level and option. For SOL_SOCKET, SO_ERROR, _net_convert_error() needs to be called on *opt. In getsockopt, a simple if (level == SOL_SOCKET && optname == SO_ERROR) *opt = _net_convert_error(*opt) before returning should be sufficient.
That does not fix the stale EINPROGRESS, but that part at least is easily worked around in software (treat EINPROGRESS after poll() has triggered the same as 0).
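As a rough sketch of the two ideas above (translate only the SO_ERROR value, then treat a leftover EINPROGRESS as success), something like the following could work. Everything named here is hypothetical: `error_converter` is a stand-in for libctru's internal `_net_convert_error()`, whose real signature and sign convention are not assumed, and `demo_convert` only encodes the single mapping reported in this issue (-26 to EINPROGRESS).

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical stand-in type for libctru's internal _net_convert_error().
 * The real function's signature and sign convention are NOT assumed here. */
typedef int (*error_converter)(int raw);

/* Illustration only: encodes just the mapping reported in this issue. */
static int demo_convert(int raw)
{
    return raw == -26 ? EINPROGRESS : raw;
}

/* Translate the option value only for SOL_SOCKET/SO_ERROR; every other
 * level/option pair passes through unchanged. */
static int fixup_sockopt_value(int level, int optname, int value,
                               error_converter convert)
{
    if (level == SOL_SOCKET && optname == SO_ERROR)
        return convert(value);
    return value;
}

/* Application-side workaround: once poll() has reported POLLOUT, treat a
 * stale EINPROGRESS the same as 0 (connected). */
static int connect_result_after_pollout(int so_error)
{
    return so_error == EINPROGRESS ? 0 : so_error;
}
```

With this, `connect_result_after_pollout(fixup_sockopt_value(SOL_SOCKET, SO_ERROR, -26, demo_convert))` evaluates to 0, while a real failure such as ECONNREFUSED is passed through untouched.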
Can you try just calling connect() again and see what it returns? Hopefully you'd get errno=EALREADY when it's still connecting, errno=EISCONN when connection succeeded, and some other errno for connection failures.
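To make the suggestion concrete, here is a minimal sketch of how the errno from a repeated connect() call might be interpreted. The enum and function names are made up for illustration, and only the errno values mentioned above are given a special meaning.

```c
#include <errno.h>

/* Hypothetical names for illustration; not part of any library. */
enum connect_state { CONNECT_PENDING, CONNECT_DONE, CONNECT_FAILED };

/* Interpret errno after a repeated connect() on a non-blocking socket
 * returned -1. A return value of 0 from connect() itself would also mean
 * "connected" and has to be handled by the caller. */
static enum connect_state classify_reconnect_errno(int err)
{
    switch (err) {
    case EALREADY:
        return CONNECT_PENDING;  /* still connecting */
    case EISCONN:
        return CONNECT_DONE;     /* connection succeeded */
    default:
        return CONNECT_FAILED;   /* any other errno: connection failure */
    }
}
```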
I wait for it using poll(). When I call connect again, it indeed tells me EISCONN. However, if there is an error, the poll() seems to never return, and I'm stuck there forever. On desktop operating systems, this seems to trigger POLLOUT in any case, and getsockopt can then be used to determine whether connecting was successful or not.
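For comparison, the desktop pattern described here usually looks something like the sketch below: start a non-blocking connect(), wait for POLLOUT, then read SO_ERROR to find out whether the connection succeeded. `connect_nonblocking` is a hypothetical helper name, and returning ETIMEDOUT on a poll() timeout is a choice made for this sketch, not something taken from the code in question.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper. Returns 0 once connected, otherwise an errno value
 * (ETIMEDOUT if poll() timed out before the socket became writable). */
static int connect_nonblocking(int fd, const struct sockaddr *addr,
                               socklen_t len, int timeout_ms)
{
    /* Switch the socket to non-blocking mode. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return errno;

    if (connect(fd, addr, len) == 0)
        return 0;                 /* connected immediately */
    if (errno != EINPROGRESS)
        return errno;             /* immediate failure */

    /* Wait until the socket reports writability (or an error). */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return errno;
    if (ready == 0)
        return ETIMEDOUT;

    /* SO_ERROR holds the outcome of the connect attempt: 0 on success. */
    int so_error = 0;
    socklen_t optlen = sizeof so_error;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &optlen) < 0)
        return errno;
    return so_error;
}
```

On the platform discussed in this issue, the value read from SO_ERROR would still need the handling described earlier in the thread before it can be compared against errno constants.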
I can confirm that both of these issues (poll may wait for the entire timeout; getsockopt returns -26 and requires that connect EISCONN workaround hack) still exist in libctru 2.0.0-2.
You shouldn't be calling connect() again. Did you request POLLOUT in your pollfd::events? I use this for non-blocking connects in ftpd and it's never given me an issue on 3DS.
Yes. For reference, here's the connection code in question (note that anything prefixed with Socket:: is a thin wrapper around a standard BSD sockets function in this case) and the poll function it calls, which explicitly sets POLLOUT. This is old code I've been refactoring, but I've confirmed that this method of polling a non-blocking socket works on every other platform I've tried that supports poll.
Here's my version of the connect() workaround as per this issue.
edit: replaced the first two links with permanent links.
I took a look at ftpd and it looks like it's using poll in a loop with a fixed 100ms timeout, which might be why there aren't any noticeable issues. My code currently gives poll a 10 second timeout, and poll seems to block for that entire duration before indicating writability.
It looks like POSIX is pretty vague about when/if poll should return before the timeout ends (on the other hand, the Linux manual pages say it should return immediately when the socket is ready). I'll probably just work around this issue by splitting it up over multiple poll calls.
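A minimal sketch of that workaround, assuming nothing beyond standard poll(): one long wait is split into short slices so that readiness is noticed at the next slice boundary at the latest. `poll_in_slices` is a hypothetical helper, and the 100ms fallback slice simply mirrors the ftpd timeout mentioned above.

```c
#include <poll.h>

/* Hypothetical helper: waits up to total_ms, but never asks poll() to
 * block for longer than slice_ms at a time. Returns what poll() returns:
 * >0 ready descriptors, 0 on overall timeout, -1 on error (errno set). */
static int poll_in_slices(struct pollfd *fds, nfds_t nfds,
                          int total_ms, int slice_ms)
{
    if (slice_ms <= 0)
        slice_ms = 100;  /* arbitrary fallback slice length */

    int waited = 0;      /* nominal time waited so far, not wall-clock */
    for (;;) {
        int remaining = total_ms - waited;
        if (remaining < 0)
            remaining = 0;
        int step = remaining < slice_ms ? remaining : slice_ms;

        int ready = poll(fds, nfds, step);
        if (ready != 0 || step == remaining)
            return ready;  /* ready, error, or the final slice timed out */
        waited += step;    /* this slice timed out; try the next one */
    }
}
```

A call such as `poll_in_slices(&pfd, 1, 10000, 100)` would replace a single `poll(&pfd, 1, 10000)`. The total is tracked in nominal slice time, so the real elapsed time can drift slightly beyond `total_ms`.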
That still leaves the getsockopt with SOL_SOCKET, SO_ERROR issue.
We ran into this same issue when implementing our non-blocking connect for RetroArch's network.