erlang/otp

prim_inet doesn't clean up monitor if port_command crashes

zzydxm opened this issue · 5 comments

Describe the bug
When the port owner dies, for example because a linked process died, erlang:port_command can crash and execution falls into this clause:
https://github.com/erlang/otp/blob/master/erts/preloaded/src/prim_inet.erl#L616-L619

Problem (1): This leaves the monitor in place, so a stray monitor message can arrive later; we should demonitor here.
Problem (2): The late monitor message arrives in the form
{'DOWN',#Ref<0.754902196.4233101336.24126>,process,#Port<0.55426>,{timeout,{...}}}
Its type tag is process but the object is a port, which is unexpected in any case; we think it is a bug in BEAM.

To Reproduce
No local reproduction yet, but from the code problem (1) is straightforward.
For a reproduction of problem (2), see the comment below.

Expected behavior
There should not be any DOWN message when prim_inet:send fails.

Affected versions
OTP 26

Here is a simple way to reproduce the issue manually:

(fun() ->
    %% Trap exits so the linked port's exit shows up as an 'EXIT' message.
    _ = catch erlang:process_flag(trap_exit, true),
    Parent = self(),
    %% The spawned process opens the port and is therefore the port owner.
    {Pid, PidMon} = spawn_monitor(fun() ->
        Port = erlang:open_port({spawn, "sleep 10"}, []),
        Parent ! {self(), Port},
        timer:sleep(timer:seconds(10)),
        exit({timeout, expired})
    end),
    Port = receive {Pid, P} -> P end,
    _ = catch erlang:link(Port),
    PortMon = erlang:monitor(port, Port),
    %% Kill the port owner after 5 seconds.
    _ = spawn(fun() -> timer:sleep(timer:seconds(5)), erlang:exit(Pid, {timeout, killed}) end),
    #{
        port => Port,
        port_mon => PortMon,
        pid => Pid,
        pid_mon => PidMon
    }
end)().
% wait 5 seconds...
flush().
Shell got {'DOWN',#Ref<0.3044587589.3799252993.136686>,process,<0.91.0>,
                  {timeout,killed}}
Shell got {'EXIT',#Port<0.3>,{timeout,killed}}
Shell got {'DOWN',#Ref<0.3044587589.3799252993.136688>,process,#Port<0.3>,
                  {timeout,killed}}

Edit: Confirmed that the above behavior is present in both OTP 26.x and OTP 27.0-rc3
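
Until the tag is fixed, code that waits for a port monitor can avoid depending on the type tag by matching on the monitor reference alone. A minimal sketch of that idea; the module name down_by_ref and the function await_down/2 are hypothetical and not part of OTP:

```erlang
-module(down_by_ref).
-export([await_down/2]).

%% Hypothetical helper: wait for the 'DOWN' message that belongs to Mref.
%% The third element (the type tag) is deliberately ignored, so the clause
%% matches whether the tag says process or port.
await_down(Mref, Timeout) ->
    receive
        {'DOWN', Mref, _Type, Object, Reason} ->
            {down, Object, Reason}
    after Timeout ->
        timeout
    end.
```

Since monitor references are unique, matching on the reference is enough to identify the message; the tag adds no information the caller does not already have.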

It is obviously a bug, and the solution would be simply to add a demonitor(Mref, [flush]) in the catch error: _ -> clause.
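
The shape of that fix can be sketched outside prim_inet. This is an illustration only, not the actual prim_inet code: the module demonitor_sketch and the function command/2 are hypothetical, and the success branch simply demonitors instead of waiting for a reply as a real send would.

```erlang
-module(demonitor_sketch).
-export([command/2]).

%% Hypothetical helper: monitor the port, issue the command, and make
%% sure the monitor never outlives the call, even if port_command raises.
command(Port, Data) ->
    Mref = erlang:monitor(port, Port),
    try erlang:port_command(Port, Data) of
        true ->
            %% Simplification: a real send would wait for a reply here.
            erlang:demonitor(Mref, [flush]),
            ok
    catch
        error:_ ->
            %% Remove the monitor and drop any 'DOWN' already in the
            %% message queue, so the caller never sees a late message.
            erlang:demonitor(Mref, [flush]),
            {error, einval}
    end.
```

The flush option matters here: a plain demonitor would stop future messages but leave a 'DOWN' that was already delivered sitting in the caller's mailbox.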

But I will have to get the VM guys to take a look at the weird process tag first...

The process tag problem was apparently easy to find. It is on the ToDo list.

Fixed in OTP-26.2.5.1 and planned for OTP-27.0.1.

Awesome, thank you @RaimoNiskanen !