shadowsocks/shadowsocks-rust

负载均衡测速报错 error: Connection refused (os error 111)

pexcn opened this issue · 27 comments

pexcn commented

我的配置文件如下:

{
    "locals": [
        {
            "local_address": "127.0.0.1",
            "local_port": 1234,
            "protocol": "redir",
            "tcp_redir": "tproxy",
            "udp_redir": "tproxy"
        },
        {
            "local_address": "127.0.0.1",
            "local_port": 5300,
            "protocol": "tunnel",
            "forward_address": "8.8.8.8",
            "forward_port": 53
        }
    ],

    "servers": [
        //
        // FASTEST
        //
        {
            "server": "xxx.xxx.xxx.xxx",
            "server_port": 2047,
            "password": "${SHADOWSOCKS_PASSWORD}",
            "method": "chacha20-ietf-poly1305",
            "mode": "tcp_only",
            "plugin": "qtun-client",
            "plugin-opts": "host=my.domain.com"
        },
        {
            "server": "127.0.0.1",
            "server_port": 2004,
            "password": "${SHADOWSOCKS_PASSWORD}",
            "method": "2022-blake3-chacha8-poly1305",
            "mode": "udp_only"
        },

        //
        // TCP
        //
        {
            "server": "111.111.111.111",
            "server_port": 1984,
            "password": "${SHADOWSOCKS_PASSWORD}",
            "method": "2022-blake3-chacha8-poly1305",
            "mode": "tcp_only"
        },
        {
            "server": "111.111.111.112",
            "server_port": 1984,
            "password": "${SHADOWSOCKS_PASSWORD}",
            "method": "2022-blake3-chacha8-poly1305",
            "mode": "tcp_only"
        },
        {
            "server": "111.111.111.113",
            "server_port": 1984,
            "password": "${SHADOWSOCKS_PASSWORD}",
            "method": "2022-blake3-chacha8-poly1305",
            "mode": "tcp_only"
        },

        //
        // UDP
        //
        {
            "server": "127.0.0.1",
            "server_port": 2001,
            "password": "${SHADOWSOCKS_PASSWORD}",
            "method": "2022-blake3-chacha8-poly1305",
            "mode": "udp_only"
        },
        {
            "server": "127.0.0.1",
            "server_port": 2002,
            "password": "${SHADOWSOCKS_PASSWORD}",
            "method": "2022-blake3-chacha8-poly1305",
            "mode": "udp_only"
        },
        {
            "server": "127.0.0.1",
            "server_port": 2003,
            "password": "${SHADOWSOCKS_PASSWORD}",
            "method": "2022-blake3-chacha8-poly1305",
            "mode": "udp_only"
        }
    ],

    "balancer": {
        "max_server_rtt": 5,
        "check_interval": 30,
        "check_best_interval": 15
    }
}

启动命令如下:

ssservice local --config ./local.json -vv 2>&1 | grep ping_balancer | grep -v "got AEAD"

其中日志第三行出现了 Connection refused (os error 111), 而实际是可以使用的。
我这里使用了 qtun 插件,服务器中只监听了 UDP 的 2047 端口,请问这是否和这个原因有关?

[shadowsocks_service::local::loadbalancing::ping_balancer] kept best TCP server 111.111.111.111:1984
[shadowsocks_service::local::loadbalancing::ping_balancer] kept best UDP server 127.0.0.1:2004
[shadowsocks_service::local::loadbalancing::ping_balancer] failed to check TCP server xxx.xxx.xxx.xxx:2047, error: Connection refused (os error 111)
[shadowsocks_service::local::loadbalancing::ping_balancer] updated remote TCP server xxx.xxx.xxx.xxx:2047 (score: 8000)
[shadowsocks_service::local::loadbalancing::ping_balancer] checked remote UDP server 127.0.0.1:2004 latency with 16 ms
[shadowsocks_service::local::loadbalancing::ping_balancer] updated remote UDP server 127.0.0.1:2004 (score: 7)
[shadowsocks_service::local::loadbalancing::ping_balancer] checked remote UDP server 127.0.0.1:2003 latency with 174 ms
[shadowsocks_service::local::loadbalancing::ping_balancer] updated remote UDP server 127.0.0.1:2003 (score: 1071)
[shadowsocks_service::local::loadbalancing::ping_balancer] checked remote UDP server 127.0.0.1:2002 latency with 181 ms
[shadowsocks_service::local::loadbalancing::ping_balancer] updated remote UDP server 127.0.0.1:2002 (score: 79)
[shadowsocks_service::local::loadbalancing::ping_balancer] checked remote TCP server 111.111.111.113:1984 latency with 330 ms
[shadowsocks_service::local::loadbalancing::ping_balancer] updated remote TCP server 111.111.111.113:1984 (score: 139)
[shadowsocks_service::local::loadbalancing::ping_balancer] checked remote TCP server 111.111.111.112:1984 latency with 350 ms
[shadowsocks_service::local::loadbalancing::ping_balancer] updated remote TCP server 111.111.111.112:1984 (score: 378)
[shadowsocks_service::local::loadbalancing::ping_balancer] checked remote TCP server 111.111.111.111:1984 latency timeout, elapsed 5001 ms
[shadowsocks_service::local::loadbalancing::ping_balancer] updated remote TCP server 111.111.111.111:1984 (score: 8000)
[shadowsocks_service::local::loadbalancing::ping_balancer] checked remote UDP server 127.0.0.1:2001 latency timeout, elapsed 5001 ms
[shadowsocks_service::local::loadbalancing::ping_balancer] updated remote UDP server 127.0.0.1:2001 (score: 8000)
dev4u commented

去掉qtun插件,或者换在一个稳定的网络环境下使用他。

pexcn commented

能兼容这个插件吗?我想让它搭配 load balancer 一起使用

I don't think it has anything related to "compatibility".

Since the connect() call on a TCP socket returned Connection refused (os error 111), then maybe your qtun-client was not listening on the port that sslocal expected.

Is qtun-client supports the environment variables defined in SIP003?

dev4u commented

不是qtun不兼容,而是当你使用的网络发生了变化,ss client重用之前建立好的连接有可能已经broken,导致你看到的异常。

Of course not. The connection in ping balancer is always new.

Since this server has plugin, so the Connection refused (os error 111) indicated that sslocal failed to connect to plugin's port that qtun-client should listen to.

If the qtun-client comes from this project: https://github.com/shadowsocks/qtun, then it should support SIP003. Hmm.

dev4u commented

Of course not. The connection in ping balancer is always a new.

我说qtun呢。

Then it won't fail when sslocal tried to connect() to its listening port.

pexcn commented

不是qtun不兼容,而是当你使用的网络发生了变化,ss client重用之前建立好的连接有可能已经broken,导致你看到的异常。

网络是稳定的,服务端和客户端都是刚刚启动,并且另外再开了一个客户端测试过是可用的。

Since the connect() call on a TCP socket returned Connection refused (os error 111), then maybe your qtun-client was not listening on the port that sslocal expected.

也许这和 qtun 的 listen 参数有关?它默认是监听了 8138 端口

Is qtun-client supports the environment variables defined in SIP003?

应该是支持的,它在源码里也有体现 https://github.com/shadowsocks/qtun/blob/master/src/args/mod.rs#L6-L26

pexcn commented

https://github.com/shadowsocks/qtun/blob/9e1834e00f38d0424f5252fe6331533bfc50d207/src/client.rs#L49-L57

The listen IP and port should come from environment variables.

Did you see the line listening on ... from qtun-client? https://github.com/shadowsocks/qtun/blob/9e1834e00f38d0424f5252fe6331533bfc50d207/src/client.rs#L92-L94

是的,我发现尽管配置了 ...;listen=127.0.0.1:2047, 日志里也会出现监听了 127.0.0.1:8138

pexcn commented

也许我应该把这个 bug 反馈给 qtun 😄

Did you really see the log listening on ... from qtun-client?

pexcn commented

Did you really see the log listening on ... from qtun-client?

是的,日志如下:

/tmp # ssservice local --config ./local.json -v
2023-02-02T14:34:24.548326221+08:00 INFO  [172:140030336721968] [shadowsocks_rust::service::local] shadowsocks local 1.15.2 build 2023-01-17T14:27:56.620285783+00:00
2023-02-02T14:34:24.550215301+08:00 DEBUG [172:140030336721968] [shadowsocks::plugin] started plugin "qtun-client" on 127.0.0.1:38649 <-> xxx.xxx.xxx.xxx:2047 (175)
[2023-02-02T06:34:24Z INFO  qtun_client] listening on 127.0.0.1:8138
2023-02-02T14:34:27.551698778+08:00 DEBUG [172:140030336721968] [shadowsocks_service::local::loadbalancing::ping_balancer] failed to check TCP server xxx.xxx.xxx.xxx:2047, error: Connection refused (os error 111)
2023-02-02T14:34:32.553126510+08:00 INFO  [172:140030336721968] [shadowsocks_service::local::loadbalancing::ping_balancer] chose best TCP server yyy.yyy.yyy.yyy:1984
2023-02-02T14:34:32.553231968+08:00 INFO  [172:140030336721968] [shadowsocks_service::local::loadbalancing::ping_balancer] chose best UDP server 127.0.0.1:2004
pexcn commented

我发现 qtun 只监听了 UDP 端口,会不会是这个原因?😮

root@HK ~/docker-compose/net/shadowsocks-rust/shadowsocks-rust-data # netstat -naop | grep qtun
udp        0      0 0.0.0.0:2047            0.0.0.0:*                           462031/qtun-server   off (0.00/0/0)
unix  3      [ ]         STREAM     CONNECTED     8615865  462031/qtun-server
unix  3      [ ]         STREAM     CONNECTED     8615866  462031/qtun-server

Of course, qtun-server should only listen to UDP port, this is QUIC.

2023-02-02T14:34:24.550215301+08:00 DEBUG [172:140030336721968] [shadowsocks::plugin] started plugin "qtun-client" on 127.0.0.1:38649 <-> xxx.xxx.xxx.xxx:2047 (175)
[2023-02-02T06:34:24Z INFO  qtun_client] listening on 127.0.0.1:8138

qtun-client should listen to 127.0.0.1:38649, you may have to open an issue to qtun, or compile a new one with the latest code. I couldn't see any problems on the master branch.

pexcn commented

已经是最新版编译的了,还是有问题,我稍后会向 qtun 再提个 issue.
感谢解答 😄

pexcn commented

@zonyitoo 大佬,还想问下,我把 qtun 插件替换成了 xray-plugin, 使用它的 quic 协议。
其余配置项和 qtun 完全一样,拿 windows 的客户端测试它是能正常使用的。但是负载均衡的日志里它报了与 qtun 不同的错误 error: unexpected end of file,请问这是 xray-plugin 的问题吗?

相关信息:

root@OpenWrt:~# netstat -nap | grep xray-plugin
tcp        0      0 127.0.0.1:39367         0.0.0.0:*               LISTEN      26746/xray-plugin
tcp        0      0 127.0.0.1:39367         127.0.0.1:52048         ESTABLISHED 26746/xray-plugin

2023-02-02T17:41:13.223997708+08:00 DEBUG [57:140240382620720] [shadowsocks::plugin] started plugin "xray-plugin" on 127.0.0.1:39367 <-> xxx.xxx.xxx.xxx:2047 (64)

2023-02-02T17:42:48.770454819+08:00 DEBUG [38:140083682825264] [shadowsocks_service::local::loadbalancing::ping_balancer] failed to check TCP server xxx.xxx.xxx.xxx:2047, error: unexpected end of file

It looks ss-rust doesn't pass SS_PLUGIN_OPTIONS correctly to the plugin in load balance mode. @zonyitoo can you double check?

Hmm, plugins are launched with exactly the same code:

if let Some(ref opt) = plugin.plugin_opts {
cmd.env("SS_PLUGIN_OPTIONS", opt);
}

Load balancer is always started even if there is only 1 server.

Could you please print the value of svr_cfg.plugin() in log and see if it contains all parameters correctly. @pexcn

for server in &mut servers {
let server = Arc::get_mut(server).unwrap();
let svr_cfg = server.server_config_mut();
if let Some(p) = svr_cfg.plugin() {
// Start Plugin Process
let plugin = Plugin::start(p, svr_cfg.addr(), PluginMode::Client)?;
svr_cfg.set_plugin_addr(plugin.local_addr().into());
plugins.push(plugin);
}
}

Just add info!("{:?}", p) inside the for loop.

pexcn commented

Just add info!("{:?}", p) inside the for loop.

好的,我晚点试试😄

pexcn commented

@zonyitoo plugin_args 似乎是空的,以下是配置和日志。

配置:

{
    "locals": [
        {
            "local_address": "127.0.0.1",
            "local_port": 12345,
            "protocol": "redir",
            "tcp_redir": "tproxy",
            "udp_redir": "tproxy"
        }
    ],

    "servers": [
        {
            "server": "xxx.xxx.xxx.xxx",
            "server_port": 2047,
            "password": "xxxxxxxxxxxxxxxxx",
            "method": "chacha20-ietf-poly1305",
            "mode": "tcp_only",
            "plugin": "./qtun-client",
            "plugin-opts": "host=example.com"
        },
        {
            "server": "xxx.xxx.xxx.xxx",
            "server_port": 2049,
            "password": "xxxxxxxxxxxxxxxxx",
            "method": "chacha20-ietf-poly1305",
            "mode": "tcp_only",
            "plugin": "./xray-plugin",
            "plugin-opts": "fast-open;mode=quic;host=example.com;loglevel=info"
        }
    ],

    "balancer": {
        "max_server_rtt": 5,
        "check_interval": 30,
        "check_best_interval": 15
    }
}

日志:

root@ARCH ~/shadowsocks-rust/target/debug # ./ssservice local --config ./local.json -v
2023-02-03T22:03:33.325795219+08:00 INFO  [6773:139947323509696] [shadowsocks_rust::service::local] shadowsocks local 1.15.2 build 2023-02-03T13:51:12.540902582+00:00
2023-02-03T22:03:33.329036745+08:00 INFO  [6773:139947323509696] [shadowsocks_service::local::loadbalancing::ping_balancer] PluginConfig { plugin: "./qtun-client", plugin_opts: None, plugin_args: [] }
2023-02-03T22:03:33.330807188+08:00 DEBUG [6773:139947323509696] [shadowsocks::plugin] started plugin "./qtun-client" on 127.0.[2023-02-03T14:03:33Z INFO  qtun_client] listening on 127.0.0.1:8138
0.1:34077 <-> xxx.xxx.xxx.xxx:2047 (6777)
2023-02-03T22:03:33.344471928+08:00 INFO  [6773:139947323509696] [shadowsocks_service::local::loadbalancing::ping_balancer] PluginConfig { plugin: "./xray-plugin", plugin_opts: None, plugin_args: [] }
2023-02-03T22:03:33.357091969+08:00 DEBUG [6773:139947323509696] [shadowsocks::plugin] started plugin "./xray-plugin" on 127.0.0.1:34883 <-> xxx.xxx.xxx.xxx:2049 (6780)
2023/02/03 22:03:33 Xray 1.7.3 (Xray, Penetrates Everything.) Custom (go1.20 linux/amd64)
2023/02/03 22:03:33 A unified platform for anti-censorship.
2023/02/03 22:03:33 [Warning] core: Xray 1.7.3 started
2023/02/03 22:03:33 127.0.0.1:46104 accepted tcp:127.0.0.1:0
2023/02/03 22:03:36 127.0.0.1:46106 accepted tcp:127.0.0.1:0
2023-02-03T22:03:36.375806051+08:00 DEBUG [6773:139947323509696] [shadowsocks_service::local::loadbalancing::ping_balancer] failed to check TCP server xxx.xxx.xxx.xxx:2047, error: Connection refused (os error 111)
2023-02-03T22:03:38.459856755+08:00 DEBUG [6773:139947323509696] [shadowsocks_service::local::loadbalancing::ping_balancer] failed to check TCP server xxx.xxx.xxx.xxx:2049, error: unexpected end of file
2023-02-03T22:03:38.460549296+08:00 INFO  [6773:139947323509696] [shadowsocks_service::local::loadbalancing::ping_balancer] chose best TCP server xxx.xxx.xxx.xxx:2047
2023-02-03T22:03:38.461702672+08:00 INFO  [6773:139947321403136] [shadowsocks_service::local::redir::tcprelay] shadowsocks TCP redirect (tproxy) listening on 127.0.0.1:12345
PluginConfig { plugin: "./qtun-client", plugin_opts: None, plugin_args: [] }

Hmm, that's unexpected!

Ah, I see. It should be "plugin_opts" instead of "plugin-opts"!!!!!

pexcn commented

@zonyitoo 确实是打错了,改成 plugin_opts 就可以了,感谢指点😅😅

Ah, I see. It should be "plugin_opts" instead of "plugin-opts"!!!!!

Maybe we should add #[serde(deny_unknown_fields)] to the config struct to prevent issues like this in the future.

Update: opened #1106