swgp-go: Userspace WireGuard Proxy with Minimal Overhead
database64128 opened this issue · 4 comments
TL;DR: It's similar to #88, but lives in userspace. It can proxy WireGuard traffic without affecting tunnel MTU.
Project URL: https://github.com/database64128/swgp-go
WireGuard traffic can be easily identified due to the following properties:
- Cleartext message type: 1, 2, 3, 4
- Handshake occurs every 2 minutes.
- Handshake packets have a fixed size (148/92/64 for initiation/response/cookie).
- Handshake initiation and response packets usually have an all-zero MAC2 field at the end.
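To illustrate how little work such identification takes, here is a minimal sketch of a passive classifier built only from the properties listed above. The function `classify` and its return values are hypothetical names invented for this example; it is not taken from swgp-go or from any real censor, and it ignores the handshake timing signal.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Fixed message sizes in bytes, as listed above.
const (
	sizeInitiation = 148
	sizeResponse   = 92
	sizeCookie     = 64
	minDataSize    = 32 // 16-byte data header + 16-byte Poly1305 tag
	mac2Size       = 16 // MAC2 is the last 16 bytes of a handshake message
)

// allZero reports whether every byte in b is zero.
func allZero(b []byte) bool {
	for _, v := range b {
		if v != 0 {
			return false
		}
	}
	return true
}

// classify is a hypothetical heuristic (an assumption for illustration).
// It returns the guessed message kind, whether MAC2 is all-zero (only
// meaningful for initiation and response), and whether the packet matched.
func classify(p []byte) (kind string, mac2Zero bool, ok bool) {
	if len(p) < 4 {
		return "", false, false
	}
	// The 1-byte message type is followed by 3 reserved zero bytes, so the
	// first 4 bytes read as a little-endian uint32 equal the type value.
	switch binary.LittleEndian.Uint32(p[:4]) {
	case 1:
		if len(p) == sizeInitiation {
			return "initiation", allZero(p[len(p)-mac2Size:]), true
		}
	case 2:
		if len(p) == sizeResponse {
			return "response", allZero(p[len(p)-mac2Size:]), true
		}
	case 3:
		if len(p) == sizeCookie {
			return "cookie", false, true
		}
	case 4:
		if len(p) >= minDataSize {
			return "data", false, true
		}
	}
	return "", false, false
}

func main() {
	pkt := make([]byte, sizeInitiation)
	pkt[0] = 1 // handshake initiation with MAC2 left all-zero
	kind, mac2Zero, ok := classify(pkt)
	fmt.Println(kind, mac2Zero, ok)
}
```

A real classifier would likely combine several packets of a flow before deciding, but even this single-packet check has very few false positives on handshake messages because type, reserved bytes, and exact length must all match.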
The GFW will block the remote peer's UDP port for a few days after about a week's continuous usage. The blocking only happens to IPv4 remote peers. Although for now you can avoid being blocked by using IPv6, eventually the GFW will tighten restrictions on IPv6, and this won't work anymore.
Proxying WireGuard traffic via general-purpose proxies adds a lot of unnecessary overhead and reduces tunnel MTU. WireGuard does its own authentication, so the proxy protocol only needs to do some padding and obfuscation.
swgp-go is tailor-made for WireGuard traffic. It has 2 proxy modes:
- Zero overhead: The first 16 bytes of all packets are encrypted using an AES block cipher. The remainder of handshake packets (message type 1, 2, 3) are also randomly padded and encrypted using an XChaCha20-Poly1305 AEAD cipher to blend into normal traffic. This mode does not affect tunnel MTU.
- Paranoid: Pad all types of packets without exceeding MTU, then XChaCha20-Poly1305 encrypt the whole packet. Data packets are padded because:
  - The length of a WireGuard data packet is always a multiple of 16.
  - Many IPv6 websites cap their outgoing MTU to 1280 for maximum compatibility.
The zero-overhead mode has proven to be good enough for proxying through the GFW without getting blocked. The paranoid mode might be useful against hypothetical statistical analysis of packet sizes in the future.
To help users configure the correct MTU on WireGuard tunnels, swgp-go calculates and prints a recommended tunnel MTU on startup and for each new proxy session. To prevent performance issues caused by misconfigured tunnel MTU, swgp-go sets DF on the socket to drop all oversized packets, so WireGuard tunnels with a misconfigured MTU simply won't work.
Userspace proxy programs usually have poor UDP performance due to significant syscall overhead. swgp-go tackles this problem by using recvmmsg(2) and sendmmsg(2) on Linux. iperf3 speed tests have shown that it increases throughput by about 51% (results posted at shadowsocks/shadowsocks-org#194 (comment)).
> WireGuard traffic can be easily identified due to the following properties:
> ...
> The GFW will block the remote peer's UDP port for a few days after about a week's continuous usage. The blocking only happens to IPv4 remote peers.
The delay before blocking is very interesting. As you have explained, WireGuard traffic is easy to identify, and real-time blocking of WireGuard appears to be feasible in practice, as many Russian ISPs had reportedly blocked WireGuard traffic (by accident) in September 2021. Then why does the censor wait for a week to block WireGuard when they could have accurately and efficiently blocked it in real time?
One hypothesis is that WireGuard uses UDP, making it easy for an attacker to weaponize its residual censorship; the censor is thus reluctant to block it in real time. However, the censor could avoid this problem by not using residual censorship at all.
> One hypothesis is that Wireguard uses UDP, making it easy for an attacker to weaponize its residual censorship; the censor is thus reluctant to block it in real time; however, the censor could avoid this problem by not using residual censorship at all.
WireGuard does not respond to replay or invalid packets at all. The censor should be able to deploy 3-tuple or 4-tuple residual censorship in real time after seeing a handshake response packet. An attacker would then have to send source-spoofed packets from both sides to cause damage.
> Then why does the censor wait for a week to block Wireguard when they could have accurately and efficiently blocked it in real time?
I think it's simply because not that many people use WireGuard to circumvent the GFW, so it's kind of treated as low priority. Note that the GFW only started blocking WireGuard on IPv4 this February.
I posted a summary of this thread (swgp-go, blocking in China) to the WireGuard mailing list.
https://lists.zx2c4.com/pipermail/wireguard/2022-June/007638.html
> the GFW only started blocking WireGuard on IPv4 this February

Anecdotal evidence: I've observed my tunnel port being blocked since about a year ago. In my case the block usually kicks in within one day, but sometimes it goes away for a week or two. A weak obfuscation in about 30 lines of eBPF stopped that.
Also,
> The GFW will block the remote peer's UDP port for a few days

> The censor should be able to deploy 3-tuple or 4-tuple residual censorship in real time after seeing a handshake response packet.

Interestingly, the UDP port block seems to be implemented by dropping any UDP traffic from $TUNNEL_SERVER:$TUNNEL_PORT to endpoints in China, but not vice versa.