OTP26 `erlang:port_command` breaking change?
mkuratczyk opened this issue · 4 comments
Describe the bug
OTP26.0-rc2 includes a change that optimises erlang:port_command
. This change, while certainly appreciated!, breaks applications that are using erlang:port_command
as a workaround to the issue of selective receive in gen_tcp
as explained in https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbit_common/src/rabbit_writer.erl#L388-L405. RabbitMQ can't handle any connections if started on OTP26 (of course at this point we are not declaring it as OTP26-compatible, so users shouldn't do that, but we are looking for a way to ship releases that would work on both 25 and 26 and a compile-time check is insufficient here).
Using erlang:port_command
like that might be considered unsupported, but we know it's at least:
RabbitMQ
emqtt
Erlang PostgreSQL Database Client
Therefore, this is a breaking change for at least some applications out there.
We'd appreciate some help or guidance how to resolve this.
- Is the whole
erlang:port_command
instead ofgen_tcp:send
a performance myth that should be retired? - Can
erlang:port_command
in OTP26 be made backwards compatible, so that apps like the above still work? - If not, I guess at least it should be added to the Potential Incompatibilities list
To Reproduce
side note: the following test case was written by AI :)
-module(send_tcp).
-export([start/0]).
start() ->
{ok, Socket} = gen_tcp:connect("localhost", 3456, [binary, {packet, 0}]),
erlang:port_command(Socket, "Hello from erlang:port_command"),
gen_tcp:close(Socket).
This exits with einval
(I have nc -l 3456
running in another terminal):
1> send_tcp:start().
** exception exit: einval
Expected behavior
Ideally: erlang:port_command
would be backwards compatible, while still optimised, but I guess that's not an option
Alternatively:
- It'd behave the old way in some cases (backwards compatible, but not optimised in some cases?)
- Backport the optimisation to OTP25 so that the updated code works well with older OTP versions
- Mention this change in the potential incompatibilities list
Affected versions
OTP26.0-rc2 and newer
the commit that changed the behaviour: f17f802#diff-86e7e31a3a23b92ea754ad259d8ffd6eb347deb522218fae1803878bf9d15e8b
Additional context
slack thread:
https://erlanger.slack.com/archives/C055DJA49/p1681483290185709
This change does not change port_command()
at all. This is a change in the internal protocol between prim_inet
and the inet-driver which also is completely backwards compatible. I don't see why it should be mentioned as being incompatible when it is not. Misusing the functionality by issuing port_command()
operations like that on your own is not supported.
We'd appreciate some help or guidance how to resolve this.
How about?
Rel = list_to_integer(erlang:system_info(otp_release)),
F = if
Rel >= 26 -> fun gen_tcp:send/2;
Rel < 25 -> fun workaround_module:send/2
end,
persistent_term:put(gen_tcp_send_workaround, F).
In epgsql
direct port call was implemented (the idea was borrowed from RabbitMQ) because epgsql
tends to accumulate a large message queue (due to active=true
by default). And pre-OTP26 gen_tcp:send
had a performance issues when the process calling it has a long message queue.
I think in epgsql
we will use -if(?OTP_RELEASE >= 26). ... use normal gen_tcp:send -else. ... use port_command ... -endif.
.
However even on pre-OTP26 gen_tcp:send
should become less of a problem, since we now support active=N
mode
Thanks for the suggestion with persistent_term
. We've been testing it since yesterday and it does the trick and we can't find any measurable cost of this lookup.
I'm happy to close the issue. I hope not too many apps will be in a similar position when upgrading to OTP26. We certainly need to update the network-related parts of RabbitMQ as they are 10+ years old for the most part.