[Bug Report]: DNS poisoning in China
acheong08 opened this issue · 15 comments
In regions such as China, socks5 doesn't cut it due to DNS poisoning. Please add support for socks5h
This wasn't much of an issue a few weeks back, but I have been getting wrong IP resolutions ever since I upgraded to Ubuntu 23.04. The upgrade could be the cause.
Could you please provide more details on how you run the tool and what you mean by "wrong DNS resolution"? Do you use --setup auto? SOCKS4a and SOCKS5h are already supported through the virtual DNS feature, which is enabled automatically. However, this requires that your /etc/resolv.conf is configured correctly. The --setup auto feature should do that. In general, you should see the DNS requests in the tun2proxy output. As an example, curl https://example.org should produce the following output:
[2023-05-25T17:00:39Z INFO tun2proxy::virtdns] DNS query: example.org
[2023-05-25T17:00:39Z INFO tun2proxy::tun2proxy] CONNECT 165.232.78.28:59678 -> example.org:80
[2023-05-25T17:00:40Z INFO tun2proxy::tun2proxy] CLOSE 165.232.78.28:59678 -> example.org:80
In that case, socks5h is used.
If you only see plain IP addresses in the tun2proxy output, then it is likely that your /etc/resolv.conf is not correctly configured. Using any nameserver address that is routed through the tunnel interface should do. If you have a DNS server in your LAN, a more specific route than the default route might cause your DNS requests to not be routed through the tunnel.
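To make the socks5/socks5h distinction above concrete, here is a minimal sketch of the SOCKS5 CONNECT request as laid out in RFC 1928. It is not taken from tun2proxy; the function name `socks5_connect_request` is hypothetical and only illustrates the wire format. With socks5h behavior the client sends the hostname itself (address type 0x03) and the proxy resolves it, so a poisoned local resolver is never consulted. With plain socks5 the client resolves first and sends an IP address (address type 0x01 or 0x04).

```python
import ipaddress
import struct


def socks5_connect_request(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request (RFC 1928, section 4).

    Hypothetical helper for illustration; not tun2proxy's implementation.
    """
    header = b"\x05\x01\x00"  # version 5, CMD=CONNECT, reserved byte
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: send the name (ATYP=0x03, length-prefixed).
        # This is the "socks5h" case, where the proxy does the DNS lookup.
        name = host.encode("idna")
        if not 1 <= len(name) <= 255:
            raise ValueError("hostname must be 1..255 bytes")
        addr = b"\x03" + bytes([len(name)]) + name
    else:
        # IP literal: ATYP=0x01 for IPv4, 0x04 for IPv6.
        addr = (b"\x01" if ip.version == 4 else b"\x04") + ip.packed
    return header + addr + struct.pack("!H", port)


if __name__ == "__main__":
    print(socks5_connect_request("example.org", 80).hex(" "))
    print(socks5_connect_request("192.0.2.1", 80).hex(" "))
```

If the tun2proxy log shows hostnames in its CONNECT lines, the request sent to the proxy corresponds to the first form; plain IP addresses correspond to the second.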
$ cat /etc/resolv.conf
nameserver 45.90.28.195
nameserver 45.90.30.195
[2023-05-26T00:28:46Z INFO tun2proxy::virtdns] DNS query: chat.openai.com
[2023-05-26T00:28:46Z INFO tun2proxy::tun2proxy] CONNECT 172.26.208.155:41000 -> chat.openai.com:443
[2023-05-26T00:28:47Z ERROR tun2proxy::tun2proxy] Read from proxy: Connection reset by peer (os error 104)
[2023-05-26T00:28:47Z INFO tun2proxy::tun2proxy] CLOSE 172.26.208.155:41000 -> chat.openai.com:443
"Connection reset by peer " happens to all websites blocked by Chinese DNS poisoning.
Do you use --setup auto
Yes
When using curl with the socks5h variable set, no such issues occur. I'm therefore assuming there is nothing wrong with DNS on the proxy server.
I have looked into this using Wireshark, but I don't really have a good explanation why you would experience issues with banned sites only. Tun2proxy behaves almost exactly like curl, except that curl supports one additional authentication protocol, which causes the difference in the first TCP data packet from the client. tun2proxy with plain curl is on the left, curl with socks5h and without tun2proxy on the right:
If this only affects sites that are blocked by the GFW, might it be the case that the GFW has some simple matching rule for deep packet inspection (DPI) that for some reason does not cover connections with different authentication mechanisms? This seems unlikely, but I cannot currently think of a better explanation.
If you have access to an SSH server outside the GFW, could you run ssh -D 1080 user@server and tun2proxy --proxy socks5://127.0.0.1:1080 --setup auto --setup-ip <public ip of your ssh server>? This should open a SOCKS5 proxy on 127.0.0.1:1080 which is tunneled through SSH, so the GFW should not be able to perform DPI. If this works, then maybe it is really caused by DPI.
I do have SSH servers outside the GFW. However, I only use this tool when the GFW mysteriously blocks all connections to outside the country every few hours, with the exception of academic institutions (to which I only have socks5 access).
If this only affects sites that are blocked by the GFW, might it be the case that the GFW has some simple matching rule for deep packet inspection (DPI) that for some reason does not cover connections with different authentication mechanisms? This seems unlikely, but I cannot currently think of a better explanation.
Possibly. The fact that I only recently ran into this problem and was able to use tun2proxy without issues in the past suggests that it's a relatively new development on their end.
As long as I do not have an environment in which I can reproduce this, I can only make guesses. If you want to, you can try out the gfw branch, which aims to exactly imitate curl's behavior. It assumes that the server does not support GSS-API, and it will break if GSS-API is actually negotiated. If you are interested in testing the DPI hypothesis, providing a local SOCKS5 server through SSH and comparing the behavior to plaintext SOCKS5 connections would still help. Alternatively, you might compile a version of curl without GSS-API support and see if it still works.
Could you explain a bit about how it works?
It doesn't do much. 23baf5d just adds an extra byte. The SOCKS client now advertises that it supports the GSS-API authentication mechanism, although it does not, hoping that the server does not support it either. (It might be more clever to not use the GSS-API identifier but an unassigned/private use identifier as per https://en.wikipedia.org/wiki/SOCKS#SOCKS5, so you do not ever run into a server that supports the authentication mechanism, but since you found out that curl works, I first wanted to exactly reproduce its behavior.)
This byte is the number of supported authentication mechanisms by the client:
https://github.com/blechschmidt/tun2proxy/blob/23baf5d6a8e4d1f7a2197b9d0d32398b7ae8e9d5/src/socks.rs#L169
This byte is the code for the GSS-API mechanism:
https://github.com/blechschmidt/tun2proxy/blob/23baf5d6a8e4d1f7a2197b9d0d32398b7ae8e9d5/src/socks.rs#L171
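For readers who do not want to open the Rust source, the two bytes described above belong to the SOCKS5 client greeting from RFC 1928: version, number of methods, then one byte per method. The following is a rough Python sketch of that layout, not a port of socks.rs. The helper name `socks5_greeting` is hypothetical, and the order in which tun2proxy lists its methods is an assumption made here for illustration.

```python
# Method identifiers from RFC 1928, section 3.
NO_AUTH = 0x00
GSSAPI = 0x01
USERNAME_PASSWORD = 0x02
# 0x80..0xFE are reserved for private methods; picking one of those is the
# alternative mentioned above for never colliding with a real server method.


def socks5_greeting(methods: list[int]) -> bytes:
    """Build the client greeting: VER, NMETHODS, METHODS...

    Hypothetical helper for illustration only.
    """
    if not 1 <= len(methods) <= 255:
        raise ValueError("a greeting carries 1..255 methods")
    return bytes([0x05, len(methods), *methods])


if __name__ == "__main__":
    # Client offering only username/password authentication.
    print(socks5_greeting([USERNAME_PASSWORD]).hex(" "))  # 05 01 02
    # Same client also advertising GSS-API (method order is an assumption):
    # the count byte changes and one extra method byte appears.
    print(socks5_greeting([GSSAPI, USERNAME_PASSWORD]).hex(" "))  # 05 02 01 02
```

Advertising one more method changes the count byte and lengthens the greeting by one byte, which is the entire difference on the wire.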
Given your observations and the fact that you no longer have issues with 23baf5d, I would assume that the GFW has a matching rule that looks for a SOCKS5 connection handshake by performing a byte-by-byte comparison of the request. Most software (curl being one of the exceptions) only supports SOCKS5 without authentication and SOCKS5 with username/password authentication, so they likely have a few rules for these, for example matching 0x05 0x01 0x02, which would be the beginning of a SOCKS5 session with username/password authentication. (The first byte is the SOCKS version, the second byte is the number of authentication mechanisms supported by the client, and the third byte is the identifier of username/password authentication.) I am a bit surprised myself that adding an extra byte seems to bypass the GFW, but maybe it's a software/hardware limitation, or it might be intended to prevent overblocking.
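The kind of rule hypothesized above can be sketched as a fixed-prefix comparison on the first TCP payload. This is purely illustrative: the actual GFW rules are unknown, the `SIGNATURES` table and the name `matches_signature` are hypothetical, and the 0x05 0x01 0x00 entry (a greeting offering only "no authentication") is an assumption added alongside the 0x05 0x01 0x02 example from the thread.

```python
# Hypothetical signatures for a naive byte-by-byte matcher.
SIGNATURES = (
    bytes([0x05, 0x01, 0x00]),  # assumed: SOCKS5, one method, no auth
    bytes([0x05, 0x01, 0x02]),  # from the thread: SOCKS5, one method, user/pass
)


def matches_signature(first_payload: bytes) -> bool:
    """Return True if the payload starts with one of the fixed signatures."""
    # bytes.startswith accepts a tuple of candidate prefixes.
    return first_payload.startswith(SIGNATURES)


if __name__ == "__main__":
    # Greeting with only username/password: caught by the exact rule.
    print(matches_signature(bytes([0x05, 0x01, 0x02])))
    # Greeting that additionally advertises GSS-API: the count byte is now
    # 0x02, so the fixed prefixes no longer match.
    print(matches_signature(bytes([0x05, 0x02, 0x01, 0x02])))
```

Under these assumptions, a matcher this rigid would miss any greeting whose method count differs from the ones it was written for, which would be consistent with the extra byte making a difference.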
Thanks!
You're welcome. Could you do me a favor and test whether the gfw2 branch works as well? If it does, I could merge that into master without risking breaking workflows with proxies that support GSS-API. Otherwise, there needs to be an extra command line option before this can be merged.
The gfw2 branch worked!
Thanks a lot! Merged it into master.