xiaorouji/openwrt-passwall2

[Bug]: xray CPU 高占用

organics2016 opened this issue · 5 comments

描述您遇到的bug

passwall2 xray 使用一段时间后出现高CPU占用. 所有 直连域名 DNS 全部失败(这个BUG有一定概率导致dnsmasq 挂掉 ,如果dnsmasq 没挂 则不会出现解析失败 ,但CPU仍然高占用), 所有 远程域名 DNS 正常.
image

复现此Bug的步骤

复现具备的条件(具备条件后 会不定期的出现,但是不清楚是什么事件触发这个问题),

  1. wan 口使用 pppoe 拨号
    image
  2. dnsmasq setting
    image
  3. passwall log
    image

运行一段时间 , 短则几小时 , 长则 1天半左右 就会出现问题 . 然后抓包.
image
上面的抓包信息会一直刷... 我搞不懂为什么 wan口的IP 会疯狂的给自己发 UDP. 也不知道怎么处理..

1.有时passwall 会自动重启, 不清楚原因,但可以肯定不是自动订阅导致的. 在passwall 自动重启后会概率导致这个问题出现.
2.有时运营商 IP到期, 会重拨, WAN口IP变更 . passwall 没自动重启 也会概率导致这个问题.
目前观察的现象是这样.. 总之很难定向复现. 每次问题出现后. 只需要重启passwall 或者 重新拨号 问题就解决了.

您想要实现的目的

CPU使用率正常, 并且不需要 手动重启或者重播.

CPU高占用 的原因应该就是上面抓包信息. 但是我不知道怎么解决这个问题.. 还是我设置有问题 ..求大佬帮帮.

日志信息

xray core 的日志 , 看起来没什么问题.

2024/07/03 05:50:46 [Info] [2018435992] proxy/dokodemo: received request for 192.168.0.234:56300
2024/07/03 05:50:46 [Info] [30074340] proxy/dokodemo: received request for 192.168.0.234:56299
2024/07/03 05:50:46 [Info] [30074340] app/dispatcher: sniffed domain: update.code.visualstudio.com
2024/07/03 05:50:46 [Info] [30074340] app/dispatcher: taking detour [default] for [tcp:update.code.visualstudio.com:443]
2024/07/03 05:50:46 [Info] [30074340] transport/internet/tcp: dialing TCP to tcp:124.71.34.215:4038
2024/07/03 05:50:46 192.168.0.234:56299 accepted tcp:13.107.246.59:443 [tcp_redir -> default]
2024/07/03 05:50:46 [Info] [2018435992] app/dispatcher: sniffed domain: az764295.vo.msecnd.net
2024/07/03 05:50:46 [Info] [2018435992] app/dispatcher: taking detour [default] for [tcp:az764295.vo.msecnd.net:443]
2024/07/03 05:50:46 [Info] [2018435992] transport/internet/tcp: dialing TCP to tcp:124.71.34.215:4038
2024/07/03 05:50:46 192.168.0.234:56300 accepted tcp:117.18.232.200:443 [tcp_redir -> default]
2024/07/03 05:50:46 [Info] [2018435992] proxy/trojan: tunneling request to tcp:az764295.vo.msecnd.net:443 via 124.71.34.215:4038
2024/07/03 05:50:46 [Info] [30074340] proxy/trojan: tunneling request to tcp:update.code.visualstudio.com:443 via 124.71.34.215:4038
2024/07/03 05:50:47 [Info] [1788726303] proxy/dokodemo: received request for 127.0.0.1:38700
2024/07/03 05:50:47 [Info] [1788726303] app/dispatcher: taking detour [dns-out] for [udp:0.0.0.0:0]
2024/07/03 05:50:47 127.0.0.1:38700 accepted udp:0.0.0.0:0 [dns-in -> dns-out]
2024/07/03 05:50:47 [Info] [1788726303] proxy/dns: handling DNS traffic to udp:127.0.0.1:15354
2024/07/03 05:50:47 [Info] app/dns: failed to lookup ip for domain home.hicloud.com at server UDP:127.0.0.1:15354 > rcode: 3
2024/07/03 05:50:48 [Info] [30074340] app/proxyman/inbound: connection ends > proxy/dokodemo: connection ends > context canceled
2024/07/03 05:50:48 [Info] [2018435992] app/proxyman/inbound: connection ends > proxy/dokodemo: connection ends > context canceled
2024/07/03 05:50:48 [Info] [795789265] app/proxyman/inbound: connection ends > proxy/dokodemo: connection ends > context canceled
2024/07/03 05:50:52 [Info] [2964980139] proxy/dokodemo: received request for 127.0.0.1:46405
2024/07/03 05:50:52 [Info] [2964980139] app/dispatcher: taking detour [dns-out] for [udp:0.0.0.0:0]
2024/07/03 05:50:52 [Info] [2964980139] proxy/dns: handling DNS traffic to udp:127.0.0.1:15354
2024/07/03 05:50:52 127.0.0.1:46405 accepted udp:0.0.0.0:0 [dns-in -> dns-out]
2024/07/03 05:50:52 [Info] [3248040876] proxy/dokodemo: received request for 192.168.0.234:58418
2024/07/03 05:50:52 [Info] [3248040876] app/dispatcher: taking detour [blackhole] for [udp:64.233.170.113:443]
2024/07/03 05:50:52 [Info] [3868121158] proxy/dokodemo: received request for 192.168.0.234:56301
2024/07/03 05:50:52 [Info] [3248040876] app/proxyman/inbound: connection ends > proxy/dokodemo: connection ends > proxy/dokodemo: failed to transport response > io: read/write on closed pipe
2024/07/03 05:50:52 192.168.0.234:58418 accepted udp:64.233.170.113:443 [udp_redir -> blackhole]
2024/07/03 05:50:52 [Info] [3868121158] app/dispatcher: sniffed domain: clients4.google.com
2024/07/03 05:50:52 [Info] [3868121158] app/dispatcher: taking detour [default] for [tcp:clients4.google.com:443]
2024/07/03 05:50:52 [Info] [3868121158] transport/internet/tcp: dialing TCP to tcp:124.71.34.215:4038
2024/07/03 05:50:52 192.168.0.234:56301 accepted tcp:64.233.170.113:443 [tcp_redir -> default]
2024/07/03 05:50:52 [Info] [3868121158] proxy/trojan: tunneling request to tcp:clients4.google.com:443 via 124.71.34.215:4038
2024/07/03 05:50:52 [Info] [371329706] proxy/dokodemo: received request for 192.168.0.234:58418
2024/07/03 05:50:52 [Info] [371329706] app/dispatcher: taking detour [blackhole] for [udp:64.233.170.113:443]
2024/07/03 05:50:52 [Info] [371329706] app/proxyman/inbound: connection ends > proxy/dokodemo: connection ends > proxy/dokodemo: failed to transport response > io: read/write on closed pipe
2024/07/03 05:50:52 192.168.0.234:58418 accepted udp:64.233.170.113:443 [udp_redir -> blackhole]
2024/07/03 05:50:53 [Info] [3238423388] proxy/dokodemo: received request for 192.168.0.234:58418
2024/07/03 05:50:53 [Info] [3238423388] app/dispatcher: taking detour [blackhole] for [udp:64.233.170.113:443]
2024/07/03 05:50:53 192.168.0.234:58418 accepted udp:64.233.170.113:443 [udp_redir -> blackhole]
2024/07/03 05:50:53 [Info] [3238423388] app/proxyman/inbound: connection ends > proxy/dokodemo: connection ends > proxy/dokodemo: failed to transport response > io: read/write on closed pipe
2024/07/03 05:50:54 [Info] [985795741] proxy/dokodemo: received request for 192.168.0.234:58418
2024/07/03 05:50:54 [Info] [985795741] app/dispatcher: taking detour [blackhole] for [udp:64.233.170.113:443]
2024/07/03 05:50:54 [Info] [985795741] app/proxyman/inbound: connection ends > proxy/dokodemo: connection ends > proxy/dokodemo: failed to transport response > io: read/write on closed pipe
2024/07/03 05:50:54 192.168.0.234:58418 accepted udp:64.233.170.113:443 [udp_redir -> blackhole]
2024/07/03 05:50:55 [Info] [1185092439] proxy/dokodemo: received request for 127.0.0.1:42824
2024/07/03 05:50:55 [Info] [1185092439] app/dispatcher: taking detour [dns-out] for [udp:0.0.0.0:0]
2024/07/03 05:50:55 [Info] [1185092439] proxy/dns: handling DNS traffic to udp:127.0.0.1:15354
2024/07/03 05:50:55 127.0.0.1:42824 accepted udp:0.0.0.0:0 [dns-in -> dns-out]
2024/07/03 05:50:55 [Info] [4141480430] proxy/dokodemo: received request for 127.0.0.1:47593
2024/07/03 05:50:55 [Info] [4141480430] app/dispatcher: taking detour [dns-out] for [udp:0.0.0.0:0]
2024/07/03 05:50:55 [Info] [4141480430] proxy/dns: handling DNS traffic to udp:127.0.0.1:15354
2024/07/03 05:50:55 127.0.0.1:47593 accepted udp:0.0.0.0:0 [dns-in -> dns-out]
2024/07/03 05:50:55 [Info] app/dns: failed to lookup ip for domain whsmarthome.hwcloudtest.cn at server UDP:127.0.0.1:15354 > rcode: 3
2024/07/03 05:50:55 [Info] app/dns: UDP:127.0.0.1:15354 got answer: www.baidu.com. TypeA -> [157.148.69.80 157.148.69.74] 52.037705ms
2024/07/03 05:50:55 [Info] [869773104] proxy/dokodemo: received request for 127.0.0.1:53001
2024/07/03 05:50:55 [Info] [869773104] app/dispatcher: taking detour [dns-out] for [udp:0.0.0.0:0]
2024/07/03 05:50:55 [Info] [869773104] proxy/dns: handling DNS traffic to udp:127.0.0.1:15354
2024/07/03 05:50:55 127.0.0.1:53001 accepted udp:0.0.0.0:0 [dns-in -> dns-out]
2024/07/03 05:50:56 [Info] app/dns: UDP:127.0.0.1:15354 got answer: www.baidu.com. TypeAAAA -> [[2408:8756:c52:1aec:0:ff:b013:5a11] [2408:8756:c52:1107:0:ff:b035:844b]] 50.231372ms
2024/07/03 05:50:56 [Info] [3600718046] proxy/dokodemo: received request for 192.168.0.234:58418
2024/07/03 05:50:56 192.168.0.234:58418 accepted udp:64.233.170.113:443 [udp_redir -> blackhole]
2024/07/03 05:50:56 [Info] [3600718046] app/dispatcher: taking detour [blackhole] for [udp:64.233.170.113:443]
2024/07/03 05:50:56 [Info] [3600718046] app/proxyman/inbound: connection ends > proxy/dokodemo: connection ends > proxy/dokodemo: failed to transport response > io: read/write on closed pipe

截图

No response

系统相关信息

luci-app-passwall2 | 1.29-1
Xray | 1.8.13
dnsmasq-full | 2.90-2
OpenWrt | OpenWrt 23.05.2 r23630-842932a63d / LuCI openwrt-23.05 branch git-23.306.39416-c86c256

其他信息

passwall2 分流真的很好用 , 功能也比clash整洁不少..非常好用.
这个问题我已经跟了两周, 这是目前收集到我认为有用的信息, 如果需要补充, 我会尽量提供.

#594
不知道是不是同一个问题,,但是表象很像.

好像是因为PPPOE 拨号的 IP没被追加到 直连规则导致的,
但是WAN口的IP会变, 我怎么动态的把IP追加到直连规则里呢?

已定位到问题:
pppoe 拨号下 WAN 口IP 没有被添加到 nftables 直连规则, 导致UDP流量进入passwall核心,转发后导致死循环.

问题在这里:

WAN_IP=$(get_wan_ip)
if [ -n "${WAN_IP}" ]; then
[ -n "${is_tproxy}" ] && nft "add rule inet fw4 PSW2_MANGLE ip daddr ${WAN_IP} counter return comment \"WAN_IP_RETURN\"" || nft "add rule inet fw4 PSW2_NAT ip daddr ${WAN_IP} counter return comment \"WAN_IP_RETURN\""
fi
unset WAN_IP

tproxy 代理模式下 wan ip 直连会被添加到 PSW2_MANGLE 规则, 否则 wan ip 会被添加到 PSW2_NAT , 但是passwall的默认转发配置是: TCP 流量走 REDIRECT , UDP 流量走 tproxy . BUG就在这里, 即使用户选择REDIRECT 进行TCP代理 , 仍然会有部分 UDP 流量通过 tproxy 代理进入核心. 如果这部分UDP 中如果恰好来自奇怪的IP 就会导致这部分UDP进入核心造成循环转发.(我这里不知道怎么稳定复现用WAN IP作为UDP源的流量, 雷神加速器会在部分情况下复现这种流量).

已知临时解决方法有两种:
第一种 : 像下面这样配置 , 让 passwall 不转发任何UDP 就可以了, 但是所有UDP也无法被代理.
image

第二种 : vi /usr/share/passwall2/nftables.sh 定位到 725 改成下面这样, 保存并重启passwall (确保你使用的防火墙是 nftables, 如果使用 iptables 需要自行编写符合规则的逻辑):

	WAN_IP=$(get_wan_ip)
	if [ -n "${WAN_IP}" ]; then
		nft "add rule inet fw4 PSW2_MANGLE ip daddr ${WAN_IP} counter return comment \"WAN_IP_RETURN\""
		nft "add rule inet fw4 PSW2_NAT ip daddr ${WAN_IP} counter return comment \"WAN_IP_RETURN\""
		echolog "  - [$?]追加WAN_IP到nftables:${WAN_IP}"
	fi
	unset WAN_IP

Stale Issue