Binding to wildcard "0.0.0.0" then connecting throws `ConnectException` on macOS
armanbilge opened this issue · 9 comments
Reproducer in 6c3ea7e.
epollcat/tests/shared/src/test/scala/epollcat/TcpSuite.scala
Lines 244 to 254 in 6c3ea7e
On my macOS 11 machine the println above gives me this:
/0:0:0:0:0:0:0:0:60833
(the port number varies per-run)
Furthermore, if I change the connect(...) to this:
connect(new InetSocketAddress("0.0.0.0", addr.asInstanceOf[InetSocketAddress].getPort))
Then the test passes.
@LeeTibbert could this be another case of getting an IPv4-compatible address instead of an IPv4-mapped address?
> could this be another case of getting an IPv4-compatible address instead of an IPv4-mapped address?
Arman, I think you have it exactly!
I will have to trace through the code. When I did the PR for the similar issue I let
::0 pass without conversion because that is the IPv6 wildcard. I have to dive
into the InetSocketAddress("0.0.0.0", 0) construction path to see how it
picks up its interior InetAddress.
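For readers following the IPv4-compatible vs IPv4-mapped question, the two layouts can be compared byte by byte. For 0.0.0.0 the IPv4-compatible form is sixteen zero bytes, which is exactly the IPv6 wildcard ::0, while the IPv4-mapped form carries the FFFF marker. A minimal sketch; the object and helper names are hypothetical and not taken from epollcat or Scala Native:

```scala
object WildcardForms {
  // IPv4-compatible layout (::a.b.c.d): 96 zero bits, then the IPv4 address.
  def ipv4Compatible(v4: Array[Byte]): Array[Byte] = {
    require(v4.length == 4, "an IPv4 address has 4 bytes")
    Array.fill[Byte](12)(0) ++ v4
  }

  // IPv4-mapped layout (::ffff:a.b.c.d): 80 zero bits, 16 one bits,
  // then the IPv4 address.
  def ipv4Mapped(v4: Array[Byte]): Array[Byte] = {
    require(v4.length == 4, "an IPv4 address has 4 bytes")
    Array.fill[Byte](10)(0) ++ Array[Byte](-1, -1) ++ v4
  }

  // True when every byte is zero, i.e. the bytes of the IPv6 wildcard ::0.
  def isAllZero(bytes: Array[Byte]): Boolean = bytes.forall(_ == 0)

  def main(args: Array[String]): Unit = {
    val any4 = Array[Byte](0, 0, 0, 0) // 0.0.0.0
    println(s"compatible form is all zero: ${isAllZero(ipv4Compatible(any4))}")
    println(s"mapped form is all zero:     ${isAllZero(ipv4Mapped(any4))}")
  }
}
```

This is why code that only inspects the sixteen bytes cannot tell an IPv4-compatible 0.0.0.0 apart from ::0.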
Arman,
I've spent a long morning tracing this and think I have an understanding of two of the things, plural,
going wrong. I need to write some C to track down a third. I also need to check how SN handles
a translation.
Overnight I will work on coming up with a good description of "what", then we can work out
possible solutions. Not to keep you hanging but any description I write now would be gibberish
and cause more confusion.
Progress but no solution, then again, one can progress around a ring and not
get to the end.
I ratted around and found a python IPv6 client/server pair that I had been using
last Summer (2022). Adapted them to this Issue and fired them up on Linux. All worked.
I then fired them up on macOS 13.0.mumble. Everything worked. Rats! I had been
expecting a connection to ::0 to fail on macOS. Both server programs reported
that the connection came in on ::1, which is sensible.
Thinking that python might be "helping" me under the covers, I cobbled together
a C client/server pair which suits this issue. Ran on Linux, everything passed, as expected.
Ran on macOS, and everything passed. Double Rats!! Strange situation where something
working is cause for dismay.
I think the code thru the server is doing a sensible thing (but plenty of room for discussion later).
I need to look more closely at the connect code and trace through, knowing that connecting
to ::0, strange as it might be, works in python and C.
More as I discover it. Time consuming as it is, I am glad you discovered this issue.
Thanks for the update :) I also ratted around in Node.js and libuv, and am hitting the same wall. Surely some other implementation has encountered these "quirks" and taken defensive measures, but I'm yet to stumble on it.
However, being unable to reproduce in a C implementation ... is interesting and unexpected 😅 is there any possibility that Scala Native is doing something during startup/initialization that could lead to this? It would explain why a vanilla C implementation behaves differently.
I believe the simple C programs worked, while epollcat revealed the underlying OS failure/behavior,
because the former were written to block until the connect() completed and epollcat is async.
More next sprint, when my mind has untangled.
As promised, the factors behind PR #93.
There are a number of factors which make this Issue and the proposed fix interesting.
Most of the below could each take a week or so poring over the relevant Internet
Engineering Task Force (IETF) Requests for Comments (RFCs). I believe that some of them
never became official but are still widely used or common practice.
- The special nature of 0.0.0.0 & ::0

  Both are defined as a "wildcard" or "any" address on the
  local machine. I believe that 0.0.0.0 is a "real" address,
  with some semantics. I also believe that ::0 is defined
  as not really existing but indicative of the "pick one" action.

  What this special meaning means is that the ::FFFF:0.0.0.0
  IPv4-mapped IPv6 address may or may not be handled in
  the same way as (substituted for) 0.0.0.0.

- Early allocation of socket fd, which in most cases is IPv6.

- If the address passed to bind is "0.0.0.0", i.e. IPv4, and the
  fd passed to bind() is IPv6, then bind() must use the IPv6
  form of "wildcard". (see "fixes" below for IPv4 alternative)

  listen()'ing on IPv6 wildcard will also listen on IPv4, unless
  the system is specifically configured otherwise.

  So the debug printf's just after the bind show the
  "astonishing" result of the server_ch listening on
  the ::0 address.

- Strange nature of connecting to 0.0.0.0 or ::0.

  I know of no RFC which actually prohibits
  connect()'ing to 0.0.0.0 or ::0 or which describes
  what that means. However doing so is somewhat
  strange. What does "connect() to 'Any' address"
  actually mean?

  IPv6 is likely to be stricter about this behavior than
  IPv4.

  After the fact, it is not surprising to see implementation-defined
  behavior when connect()'ing to 0.0.0.0 or ::0.

- macOS specific behavior.

  On macOS, at least 12.n & 13, a server_ch listen()'ing on 0.0.0.0
  appears to accept() a connect() to 0.0.0.0. It is hard to say that
  this works as one expects, since this is undefined or, at least, obscure,
  behavior. One can say that it works the way that an explicit connect()
  to 127.0.0.1 (IPv4 localhost) would.

  A server_ch listen()'ing on ::0 refuses the connection. Fair enough,
  if we are in the land of undefined behavior.
Summary:
The test code is written to send a ::0 wildcard address to the client_ch,
which then attempts to connect() to it, evoking OS-specific behavior.
Possible fixes:

- Change the test

  Re-write the test so that it does not do something "unusual" like
  trying to connect() to either 0.0.0.0 or ::0. The test is semantically
  fragile. The base note in this Issue gives an example of this fix.

- Special case connect()'ing to ::0 on macOS.

  This is the fix provided in PR #93. Special cases are ugly to the
  eye, a code smell, and almost always have a high maintenance cost.

  In the PR, the epollcat code is modified in the macOS case to do what
  the Linux C library appears to be doing: have connect() to ::0 become
  the well defined connect() to ::1 (IPv6 localhost).

- Change epollcat bind.

  This could invite a long discussion. My time is limited but
  let me sketch out some points. I am at the limits of my
  understanding of epollcat & FS2, so please forgive me
  or ignore, if I speak total nonsense.

  There are at least two paths, depending on the immutability or not
  of the fd passed to EpollAsyncServerSocketChannel#bind

  a) The better path: immutable

  epollcat bind() is defined as returning an AsynchronousServerSocketChannel.
  It currently returns itself, 'this'.

  I suspect that bind() could be implemented to create a new "fixed up"
  AsynchronousServerSocketChannel and return that. I think that the .use immediately after the .evalTap() would use this "fixed up" ASSChannel.

  The "fixup" would involve a late evaluation of the type of the socket fd. That is, if the address passed in
  is IPv4 and the java property java.net.preferIPv6Addresses is the default false (I think epollcat does not
  currently handle that java property. SN 0.4.n & 0.5.0 does/should. So there is some complexity here),
  then close the socket fd passed in, open a new IPv4 socket and call listen(). If successful, return a new
  ASSChannel using the new IPv4 socket fd and the new bound InetAddress in the SocketAddress.
  Both sources of truth now correspond and the new ASSChannel is now "well-formed".
  (With current code, the ASSChannel after bind is betwixt & between. One source of truth, the fd, is
  correct, but the bytes inside its SocketAddress report a wrong, stale story.)

  This approach uses the special natures of bind() & 0.0.0.0/::0 to mangle the rule
  "Use IPv6 implementation sockets consistently & map IPv4 addresses" to "Give me an IPv4 address for
  binding and I will use an IPv4 socket, otherwise I will use an IPv6 socket"

  b) making the bind() constructor fd a var.

  This would continue to return "this" but, if presented with an IPv4 address, would close the fd passed in, get a new IPv4 socket fd, and use that. This still leaves a mismatch between two sources of truth. One is the address of the (new IPv4) fd var and the other is the bytes within the SocketAddress of 'this'. More would need to be done to synchronize those, if feasible.
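The first two fixes share one idea: never hand a wildcard address to connect(). A minimal sketch of that translation follows, assuming a hypothetical helper named toConnectable that is not part of epollcat and is not the code in PR #93. It keeps the port and the address family and swaps 0.0.0.0 for 127.0.0.1 and ::0 for ::1:

```scala
import java.net.{Inet6Address, InetAddress, InetSocketAddress}

object WildcardConnect {
  // Loopback addresses built from raw bytes, so no name lookup is involved.
  private val loopback4: InetAddress =
    InetAddress.getByAddress(Array[Byte](127, 0, 0, 1))
  private val loopback6: InetAddress =
    InetAddress.getByAddress(Array.fill[Byte](15)(0) :+ 1.toByte)

  // Hypothetical helper: if `addr` holds a wildcard ("any") address, return
  // the loopback address of the same family with the same port. Any other
  // address is returned unchanged.
  def toConnectable(addr: InetSocketAddress): InetSocketAddress = {
    val ip = addr.getAddress // null when the address is unresolved
    if (ip == null || !ip.isAnyLocalAddress) addr
    else {
      val loopback = if (ip.isInstanceOf[Inet6Address]) loopback6 else loopback4
      new InetSocketAddress(loopback, addr.getPort)
    }
  }
}
```

In the test this would be applied to the server's local address before the client connects, for example connect(WildcardConnect.toConnectable(addr.asInstanceOf[InetSocketAddress])). Whether such a rewrite belongs in the test or inside the library is exactly the trade-off between the first two fixes.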
Conclusion:
So you thought special casing was time consuming and ugly.
Thanks for the detailed write-up. Suffice to say, it's taken me more than a few passes for ~~all~~ most of it to sink in :)
I think we can all agree that a large part of the problem is the early allocation of the socket file descriptor, before we know whether we want AF_INET or AF_INET6. I blame the poor design of the JDK in this area, which leaves us in a bad position. I am not sure if either of the proposals has semantics that match the JDK, which would make it difficult to use in cross-compiling code. The real answer is, implement a different (better) API.
Something that bugs me is what feels like an "asymmetry": specifically, that you can bind a socket to 0.0.0.0/::0, but if you get the local address of said socket and attempt to connect to it, it is undefined behavior. What good is a local address if you cannot connect to it?
Linux & macOS IPv4 obviously agree with you (as do I).
A lot turns on "get the local address of said socket".
The situation comes about because of the interaction of two "magic" items: "Any" address & bind changing
its own socket address. IPv6 is pretty strict about ::0 "Any" only making sense as an object for a server bind.
There is some room for discussion that "get local socket address" on a socket bound to ::0 should return
::1. (and that bind() should be changing its InetSocketAddress() to the same -- injecting mutability).
Rolling the discussion way back, connect(0.0.0.0) & connect(::0) should both be rejecting the connection
as "connect to any address" makes no sense. That horse left the gate a long time ago.
Life in network land. And a reason to have good libraries, such as epollcat, where a lot
of the mess is abstracted away.