zhboner/realm

Large numbers of TCP connections stuck in FIN_WAIT2 are never closed

Closed this issue · 12 comments

Currently running version 2.4.3, with Supervisor as the process manager.
Config file:

{
  "log": {
    "level": "info",
    "output": "stdout"
  },
  "dns": {
    "mode": "ipv4_only",
    "protocol": "tcp_and_udp",
    "nameservers": [
      "1.1.1.1:53",
      "8.8.8.8:53"
    ]
  },
  "network": {
    "no_tcp": false,
    "use_udp": true
  },
  "endpoints": [
    {
      "listen": "0.0.0.0:12321",
      "remote": "test2.com:12321"
    },
    {
      "listen": "0.0.0.0:12345",
      "remote": "test1.com:12345"
    }
  ]
}

Run command: realm -c test.json -n 30000 (I also tried dropping the -n flag and raising open_files via the system or Supervisor configuration instead; the result is the same.)

As uptime grows, the process's connection count keeps climbing.

Inspecting realm's connections with lsof -p PID -nP | grep TCP shows a large number of connections in the FIN_WAIT2 state,
along with some in CLOSE_WAIT.

Once the connection count reaches the limit, new connections fail, but the process does not crash.

Both 2.4.0 and 2.4.3 exhibit this problem.

By default Realm assumes that the programs it connects to are well implemented, and uses graceful shutdown. Please refer to the build guides and compile the brutal-shutdown variant yourself.

So you mean these FIN_WAIT2 connections are caused by misbehaving peers?

I noticed that when forwarding to XRay's VLESS there was no such issue, but when forwarding to XRay's Shadowsocks (with SS2022 ciphers) it shows up very quickly.
(The forwarding server and the target server are the same machine, and the backend XRay is the same process, so it shouldn't be a network issue... could this be a characteristic of SS?)

Besides FIN_WAIT2, there are also a lot of TCP entries like this one:
realm 6911 root 3138u sock 0,8 0t0 255991 protocol: TCP

Following the approach mentioned in other issues, I checked the server's net.ipv4.tcp_tw_reuse, which is set to 2; I plan to change it to net.ipv4.tcp_tw_reuse=1 and see whether things improve.

To be honest I don't know TCP in much depth... I'm mostly looking things up as I go, so this is as far as my understanding gets.

If net.ipv4.tcp_tw_reuse=1 turns out to have no effect, I'll try
compiling the brutal-shutdown build myself.

Thanks.

For each transfer direction, once the transfer is done Realm first shuts down its write side (sends a FIN to the peer and receives the ACK; FIN_WAIT1 => FIN_WAIT2), then waits for the peer to close its side (receives the peer's FIN and replies with an ACK) before tearing the connection down completely. If the peer never closes properly, Realm keeps waiting indefinitely.

Maybe related: XTLS/Xray-core#1172 XTLS/Xray-core#1224
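
To make the graceful-shutdown behavior described above more concrete, here is a minimal sketch of the pattern using tokio. It is illustrative only, not Realm's actual code; the function and variable names are made up.

use tokio::io::{copy, AsyncWriteExt};
use tokio::net::TcpStream;

// Relay bytes in both directions, forwarding each EOF as a half-close (FIN)
// and only closing the sockets after BOTH peers have finished.
async fn relay(mut client: TcpStream, mut remote: TcpStream) -> std::io::Result<()> {
    let (mut cr, mut cw) = client.split();
    let (mut rr, mut rw) = remote.split();

    let c_to_r = async {
        copy(&mut cr, &mut rw).await?; // copy until the client sends EOF
        rw.shutdown().await            // forward FIN to the remote; our side sits in FIN_WAIT1/2
    };
    let r_to_c = async {
        copy(&mut rr, &mut cw).await?; // copy until the remote sends EOF
        cw.shutdown().await            // forward FIN to the client
    };

    // Return (and drop both sockets) only after both directions are done,
    // i.e. after each peer has sent its own FIN. If a peer never closes,
    // this future never resolves and the connection stays in FIN_WAIT2.
    tokio::try_join!(c_to_r, r_to_c)?;
    Ok(())
}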

Unfortunately net.ipv4.tcp_tw_reuse=1 had no effect. I'm also puzzled why net.ipv4.tcp_fin_timeout = 30 doesn't kick in here.

Testing with the variables controlled, forwarding to a VLESS backend versus an SS backend using SS2022 ciphers (both served by the same XRay 1.6.0 process): VLESS does not show the problem described in this issue, while SS does.

I also tested the latest gost (gost -L=tcp://:port/ip:port) and saw no problem, which may come down to the shutdown approach the author mentioned.

It's getting late... tomorrow I'll test the SS backend without SS2022 ciphers and see whether it reproduces.

Test results:

SS with any of the three SS2022 ciphers reproduces the problem described in this issue.
On inspection, the server-side XRay process also has lingering TCP connections.

SS with the legacy ciphers (only tested aes-128-gcm) is fine; as soon as I switch nodes, all connections are torn down.

I also tested forwarding with Gost (gost -L=tcp://:port/ip:port) and saw no problem;
presumably this is the different shutdown approach the author mentioned.

Next I'll try compiling the brutal-shutdown build.

Although forwarding with Gost shows no problem, I still feel the issue lies with the SS2022 ciphers. Hoping for a fix.

cargo build --release --features 'brutal-shutdown'

After building with the brutal-shutdown feature enabled, testing shows the problem is gone.

For the sake of usability, versions after v2.4.4 will enable brutal-shutdown by default.

Hi, author of Shadowsocks 2022 here. I'm here because @f4nff spammed shadowsocks/shadowsocks-rust#1190 and referenced this issue.

@syouko

First of all, this has nothing to do with the Shadowsocks 2022 protocol. No changes were made to the behavior exhibited by the proxy stream. A correct implementation would, for both directions, relay data until EOF and then shutdown the write side, and close the socket when both directions are done.

If you suspect something's wrong with the protocol, feel free to try my implementation and open an issue if you still have the same problem.

@zephyrchien

There are 2 possible scenarios I can think of that will result in many proxy connections stuck in TCP_CLOSE_WAIT:

1. NAT/firewall with low timeout

The remote server has sent FIN. We have also sent FIN to the client, which has been successfully ACKed. So far, so good.

However, for whatever reason, the client did not immediately close the connection. Maybe the protocol in use still allows the client to send more bytes to the server, or maybe the client wants to use close_range(2) later when a cluster of sockets are all done. We don't know and we shouldn't have to care.

The client, unfortunately, is behind a NAT/firewall with a very low timeout. Later when the client wants to send FIN, it could never reach us. Soon the client gives up and declares the connection dead. But the server is still waiting for more payload or that FIN blocked by the NAT/firewall.

There's a simple solution to this: enable SO_KEEPALIVE and set both TCP_KEEPIDLE and TCP_KEEPINTVL to a more reasonable value like 15 seconds. The Go runtime started doing this years ago, and shadowsocks-rust adopted this in 2021.
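
As an illustration only (not realm's or shadowsocks-rust's actual code), the suggested keepalive settings could be applied roughly like this with the socket2 crate on a tokio TcpStream:

use std::time::Duration;
use socket2::{SockRef, TcpKeepalive};
use tokio::net::TcpStream;

// Enable SO_KEEPALIVE and set TCP_KEEPIDLE / TCP_KEEPINTVL to 15 seconds,
// so a peer that silently disappears behind a NAT/firewall is detected
// and the connection is torn down instead of waiting forever.
fn enable_keepalive(stream: &TcpStream) -> std::io::Result<()> {
    let keepalive = TcpKeepalive::new()
        .with_time(Duration::from_secs(15))      // idle time before the first probe (TCP_KEEPIDLE)
        .with_interval(Duration::from_secs(15)); // interval between probes (TCP_KEEPINTVL)
    // SockRef borrows the underlying fd without taking ownership of the stream.
    SockRef::from(stream).set_tcp_keepalive(&keepalive)
}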

2. Applications leaking sockets

This is the reason listed in the documentation for the brutal-shutdown feature. But I just don't see how this could help with the situation. Unless the broken application actively closes the socket, the associated resources, and the state in the application are all still there. Is it really worth it to break TCP heuristics just so we could release a tiny bit of memory a little earlier?

So please, for the sake of TCP, remove brutal-shutdown, or at least don't include it in the default features.

BTW, TCP Fast Open is very much not dead. The article you referenced is very much outdated and was written in a quite biased way. Even the Linux kernel made the change 2 years ago to no longer enable the blackhole logic by default: torvalds/linux@213ad73

f4nff commented

@database64128

It's a problem with the bidirectional io copy: the other end isn't being released properly, whereas Golang has defer, which fires unconditionally.

Go look into it properly and stop rambling.

// graceful shutdown (default): the relay future only completes once
// BOTH directions have finished; `ready!` returns Pending otherwise.
#[cfg(not(feature = "brutal-shutdown"))]
{
    let a_to_b = ready!(a_to_b);
    let b_to_a = ready!(b_to_a);
    Poll::Ready(Ok((a_to_b, b_to_a)))
}

// brutal shutdown: as soon as EITHER direction finishes, the whole relay
// is reported as done and the other direction is abandoned (its byte
// count is returned as 0), so both sockets get dropped immediately.
#[cfg(feature = "brutal-shutdown")]
{
    match (a_to_b, b_to_a) {
        (Poll::Ready(a), Poll::Ready(b)) => Poll::Ready(Ok((a, b))),
        (Poll::Pending, Poll::Ready(b)) => Poll::Ready(Ok((0, b))),
        (Poll::Ready(a), Poll::Pending) => Poll::Ready(Ok((a, 0))),
        _ => Poll::Pending,
    }
}

I found this issue because of shadowsocks/shadowsocks-rust#1190 .

The "brutal-shutdown" feature ends the whole "copy" process as soon as one direction reaches EOF. This is not correct behavior for a proxy (or tunnel), because it may lose data in one transfer direction if the other one shuts down early.

I don't think the actual CLOSE-WAIT was caused by this application-level "copy" behavior. This shouldn't be the final solution to this issue.

f4nff commented

In any case, ssserver/sslocal ending up in CLOSE-WAIT is just wrong.
Once one side closes, the other side should close as well.
Golang has defer, which fires unconditionally,
so Golang projects never run into this.
CLOSE-WAIT ties up TCP connection slots; under high concurrency that leads to refused connections.
A CLOSE-WAIT that appears and vanishes immediately is fine, but one that lingers waiting to be released is a bug.

f4nff commented

(screenshot of connection states)
This is sslocal.