shadowsocks/shadowsocks-org

SIP022: Shadowsocks 2022 Edition

database64128 opened this issue Β· 33 comments

I would prefer EIH to be in another document, because it is not required for personal users.

Made some changes:

  • Removed the EIH section.
  • Added EIH spec in a separate document.

⚠️ Major breaking change

To have better defense against active probes, we made a breaking protocol change. Part of the header, specifically type and timestamp, now sits in the first "length" chunk. After this change, one read call is all it takes to determine whether the request is legitimate or not.

Before this change, the request header in its entirety is in the first payload chunk. Implementations also attempt to squeeze as much payload as possible into the first payload chunk. It is likely that the total size of the first write exceeds the capacity of a single TCP segment (under typical MSS). Therefore, it might take more than one read call to read the first payload chunk. To defend against probes that send one byte at a time, implementations usually drain the connection when an error occurs.

This behavior has several drawbacks. First of all, it is very uncommon for a TCP service to read infinitely. TCP services usually closes the connection when invalid data is received. Secondly, it opens the door for resource exhaustion attacks. Anyone can just connect and send an infinite amount of data, or send nothing but open a lot of connections.

With this change, draining the connection is no longer necessary, because on the first read either we have a valid header that you can tell whether it's replay, or the request is illegitimate. We can just forcibly close the connection if something is wrong. The other party does not get to know how many bytes the server has read, because no matter what you send, or how much you send, as soon as it's sent, the connection gets closed.

Spec change commit: Shadowsocks-NET/shadowsocks-specs@a6fe1a1

Without enough difference, it might be a hard job to make people realize Shadowsocks 2022 != Shadowsocks. This could make user suffer from the breaking change.

What about change to a better, more identifiable name, while maintaining the characteristics to show this protocol is sharing most design ideas from Shadowsocks? (But i have no good idea...)

Reading unlimited is surely a problem, but I think allowing unlimited idle connections is not what Shadowsocks protocol should care about, since anyone can open tons of connections on any TCP server. Let's leave this job to the protocol stack in kernel.

TCP
Request stream:
+--------+------------------------+---------------------------+------------------------+---------------------------+---+
|  salt  | encrypted header chunk |  encrypted header chunk   | encrypted length chunk |  encrypted payload chunk  |...|
+--------+------------------------+---------------------------+------------------------+---------------------------+---+
| 16/32B |     11B + 16B tag      | variable length + 16B tag |  2B length + 16B tag   | variable length + 16B tag |...|
+--------+------------------------+---------------------------+------------------------+---------------------------+---+

Response stream:
+--------+------------------------+---------------------------+------------------------+---------------------------+---+
|  salt  | encrypted header chunk |  encrypted payload chunk  | encrypted length chunk |  encrypted payload chunk  |...|
+--------+------------------------+---------------------------+------------------------+---------------------------+---+
| 16/32B |    27/43B + 16B tag    | variable length + 16B tag |  2B length + 16B tag   | variable length + 16B tag |...|
+--------+------------------------+---------------------------+------------------------+---------------------------+---+

The TCP response stream started with the salt of the request stream, both request and response start with the same salt, will it make protocol easily detected?

@yzou The response stream uses a new salt.

@database64128 Thanks, my mistake.

Hello developers,I noticed that you proposed a new url schemeβ€”β€”https://github.com/shadowsocks/shadowsocks-org/wiki/SIP002-URI-Scheme. But this will cause old shadowsocks url scheme not consistent with the new one and the downstream has to make more errforts to adaptation.Could you keep the same url scheme.It's convenient to users and also good to downstream.

@FranzKafkaYu It's not a new URL scheme. The plain method:password format without the extra base64url encoding has existed for a long time, and is well supported by many clients, including shadowsocks-android. We just standardized it this time because URLs are supposed to be human-friendly. Excessive and unnecessary usage of base64 encoding in URLs is a bad practice, and must be discouraged.

About UDP relay, why is it necessary to open a new relay session(UDP socket) based on (source address, port), as there is a session id in the separated header and it's included in the server-to-client header? One client UDP socket is enough, map all:1 UDP connection.

To start a UDP session, the client generates a new random session ID and maintains a counter starting at zero as packet ID

About the packet ID, it should be a global counter, instead of a session packet counter, or else when a session expires on the server, the replay attack will pass the sliding window filter check.

why is it necessary

It's not necessary, but it's easier and much more efficient this way. With one UDP socket per session, your downlink (server -> client) relay loop only has to decrypt, authenticate, and forward packets for the current session. Multiple sessions can take advantage of multiple CPU threads. If you multiplex all UDP sessions into one UDP socket, the entire downlink can only use one CPU thread, and could easily become the bottleneck.

Nevertheless, servers are required to support UDP session multiplexing. So feel free to implement it if you like. :P

when a session expires on the server, the replay attack will pass the sliding window filter check.

This won't happen, because the packet header has a timestamp, and servers are required to have a NAT timeout of no less than 60 seconds. When a session expires, its packets won't pass the timestamp check.

In the 3.2 section of spec:
Each time the proxy receives a packet from a new source address and port, it opens a new relay session

In the 3.2.4 section of spec:
When a packet is received from the client and is successfully validated, the last seen client address SHOULD be updated. Return packets SHOULD be sent to this address. This allows UDP sessions to survive client network changes.

If UDP socket and session are 1:1 map as TCP, the client may be behind NAT in most cases, if the client address changes, all client session source ports change, and return packets will fail until a new packet is received from the client. How can UDP sessions survive client network changes?

If you multiplex all UDP sessions into one UDP socket, the entire downlink can only use one CPU thread
I don't think so. One UDP socket can be polled/processed by multiple threads/processes.

This won't happen, because the packet header has a timestamp, and servers are required to have a NAT timeout of no less than 60 seconds. When a session expires, its packets won't pass the timestamp check.

I didn't find the timestamp check for UDP packets in the spec. The timestamp is not reliable because the UDP packets may not receive in the order of sending. The correct way is to compare the timestamp with the timestamp of the first(smallest counter) packet in the sliding window.

all client session source ports change, and return packets will fail until a new packet is received from the client. How can UDP sessions survive client network changes?

You already answered it yourself. After a packet from the new client address is received, subsequent server-to-client packets are sent to the new address. An address is a host:port combination, not just the hostname or IP.

I didn't find the timestamp check for UDP packets in the spec.

Timestamp check is part of the header validation procedure. The spec is not a step-by-step guide, and does not have to tell you everything you need to do.

The timestamp is not reliable because the UDP packets may not receive in the order of sending. The correct way is to compare the timestamp with the timestamp of the first(smallest counter) packet in the sliding window.

The timestamp check is the same as TCP: time diff must be less than 30 seconds. I don't think it's necessary to keep track of the timestamp boundaries of the sliding window. I also don't see how out-of-order delivery could affect any of this.

You already answered it yourself. After a packet from the new client address is received, subsequent server-to-client packets are sent to the new address. An address is a host:port combination, not just the hostname or IP.

No, in the spec, it'll open a new relay session. The proxy should check session-id existance before open a new relay session when it receives a packet from a new host:port. Even though, for the sessions that the client doesn't send a new packet, the subsequent server-to-client packets will fail.

About timestamp, the spec must tell details for security purposes. The client system clock may be 30+ seconds faster or slower than the server, these clients would fail to connect.

I remember the spec said no need to check timestamp for UDP as it have sliding window filter.

No, in the spec, it'll open a new relay session.

That sentence you referenced is describing how clients handle incoming UDP packets from local clients, not servers. Server behaviors are documented in section 3.2.4.

About timestamp, the spec must tell details for security purposes.

How to perform a timestamp check is well documented in the spec. Just search "timestamp" in the spec and you'll see.

The client system clock may be 30+ seconds faster or slower than the server, these clients would fail to connect.

This is the intended effect.

I remember the spec said no need to check timestamp for UDP as it have sliding window filter.

It never said that. Also why would anyone put a timestamp in UDP headers and ask implementations to ignore it? The MTU of the Internet is only 1500 bytes. I was very conscious of that when designing the protocol. Every extra byte is only there because it's absolutely necessary.

I worked closely with authors of shadowsocks-rust and sing-shadowsocks to have Shadowsocks 2022 properly implemented in these projects. I helped review code and we had lots of conversations. Based on the questions you asked so far, I have some suggestions for you: Before asking more questions, maybe try to implement the protocol in code, or at least take a look at existing implementations?

@FranzKafkaYu It's not a new URL scheme. The plain method:password format without the extra base64url encoding has existed for a long time, and is well supported by many clients, including shadowsocks-android. We just standardized it this time because URLs are supposed to be human-friendly. Excessive and unnecessary usage of base64 encoding in URLs is a bad practice, and must be discouraged.

I do agree that URLs should be human-friendly.But in order to avoid expose too many infos directly,encoding with base64 is a good and convenient method.Users can share there links to others without worrying these links can be crawled by some apps or website spiders.Anyway,I would like to consult some donwstream clients app developers.Sorry to bother you for such things~

How to perform a timestamp check is well documented in the spec. Just search "timestamp" in the spec and you'll see.

I know, but it is only in the TCP section, and not mentioned in the UDP section.

I remember the spec said no need to check timestamp for UDP as it have sliding window filter.

Sorry, it's repeated nonce, not timestamp.

I appreciate your work on the new 2022 protocol. I didn't know the conversations. I implemented AEAD feature to the original python version, I'm trying to implement the 2022 to the python/libev versions according to the spec.

I know, but it is only in the TCP section, and not mentioned in the UDP section.

Appreciate the feedback. Made some changes: Shadowsocks-NET/shadowsocks-specs@9c29d3e

it's repeated nonce, not timestamp.

Only the optional method 2022-blake3-chacha20-poly1305 uses random nonces. It's obvious that the nonce does not have to be checked, because the sliding window filter already filtered out duplicate packets.

In TCP Detection Prevention section, what does the "receive buffer" stand for, the kernel or the application buffer?

If the server closes the connection when data received is not enough for decryption, the connection will be closed even 1-byte packet arrived. If the server doesn't close the connection, it would be DDOS attacked by sending 1-byte packet before TCP timeout timer.

In TCP Detection Prevention section, what does the "receive buffer" stand for, the kernel or the application buffer?

It's the socket's (in-kernel) receive buffer.

If the server closes the connection when data received is not enough for decryption, the connection will be closed even 1-byte packet arrived. If the server doesn't close the connection, it would be DDOS attacked by sending 1-byte packet before TCP timeout timer.

That's why the spec only recommends a few strategies and leaves it up to implementations to decide. There's no perfect solution here. You may implement all of these strategies and let the user select one in configuration.

Is current spec stable?

Yes, it was declared stable on June 12 2022 when I merged Shadowsocks-NET/shadowsocks-specs#2, and has been available in many popular proxy tools for about a year now.

sorry for a comming-new study, and hopes the 2022-edition updates from go-shadowsocks2.

https://shadowsocks.org/doc/sip022.html#_3-1-2-format
A length chunk is a 16-bit big-endian unsigned integer that describes the payload length in the next payload chunk. Servers and clients rely on length chunks to know how many bytes to read for the next payload chunk.
Response fixed-length header:
+------+------------------+----------------+--------+
| type | timestamp | request salt | length |
+------+------------------+----------------+--------+
| 1B | u64be unix epoch | 16/32B | u16be |
+------+------------------+----------------+--------+
Response fixed-length header:
+------+------------------+----------------+--------+
| type | timestamp | request salt | length |
+------+------------------+----------------+--------+
| 1B | u64be unix epoch | 16/32B | u16be |
+------+------------------+----------------+--------+

why do attach a length chunk of payload for both request or response header, even if with a tcp request? i'm not sure if there means both on tcp or udp connections.

https://shadowsocks.org/doc/sip022.html#_3-1-2-format
Response stream:
+--------+------------------------+---------------------------+------------------------+---------------------------+---+
| salt | encrypted header chunk | encrypted payload chunk | encrypted length chunk | encrypted payload chunk |...|
+--------+------------------------+---------------------------+------------------------+---------------------------+---+
| 16/32B | 27/43B + 16B tag | variable length + 16B tag | 2B length + 16B tag | variable length + 16B tag |...|
+--------+------------------------+---------------------------+------------------------+---------------------------+---+

what is the first encrypted payload chunk, before the length chunk?

https://shadowsocks.org/doc/sip022.html#_3-1-3-header
Request salt in response header: This maps a response stream to a request stream. The client MUST check this field in response header against the request salt.

is the request salt response header attached encrypted?

https://shadowsocks.org/doc/sip022.html#_3-1-4-replay-protection
For salt storage, implementations MUST NOT use Bloom filters or anything that could return a false positive result, because salts only have to be stored for 60 seconds.

why not leaves the same timeout: 30 seconds as the timestamp-check timeout?

go-shadowsocks2

The maintainer of that project (riobard) was opposed to having a successor to Shadowsocks AEAD (2017) in earlier discussions. So this is unlikely to happen. If you want a Go implementation of the protocol, check out https://github.com/database64128/shadowsocks-go and https://github.com/SagerNet/sing-shadowsocks.

why do attach a length chunk of payload for both request or response header, even if with a tcp request? i'm not sure if there means both on tcp or udp connections.

Just write a simple program to use TCP, and you'll understand. Each TCP connection is a bidirectional byte-stream. There are no message boundaries. TCP is not a message-based RPC protocol.

what is the first encrypted payload chunk, before the length chunk?

There isn't one.

is the request salt response header attached encrypted?

Of course it is. It's inside the response header.

why not leaves the same timeout: 30 seconds as the timestamp-check timeout?

Consider the worst case scenario: Client time is 30 seconds behind server time. This results in the largest replay window: 2*30 seconds.

thanks.

what is the first encrypted payload chunk, before the length chunk?

There isn't one.

https://shadowsocks.org/doc/sip022.html#_3-1-2-format
Response stream:
+--------+------------------------+---------------------------+------------------------+---------------------------+---+
| salt | encrypted header chunk | encrypted payload chunk | encrypted length chunk | encrypted payload chunk |...|

though it's the "| encrypted payload chunk |" i see it twice here.

why do attach a length chunk of payload for both request or response header, even if with a tcp request? i'm not sure if there means both on tcp or udp connections.

Just write a simple program to use TCP, and you'll understand. Each TCP connection is a bidirectional byte-stream. There are no message boundaries. TCP is not a message-based RPC protocol.

though i don't know why attaching a payload length chunk for tcp.

is the request salt response header attached encrypted?

Of course it is. It's inside the response header.

suggests that to be named "encrypted salt" on doc.

why not leaves the same timeout: 30 seconds as the timestamp-check timeout?

Consider the worst case scenario: Client time is 30 seconds behind server time.

then man won't be pass the timestamp-check, is it?

Unfortunately this specification conflates the transport and the proxying protocols. They were nicely separated in traditional Shadowsocks:

I'd like to see a transport protocol specification that is separate from the proxying protocol specification, so we can pick and understand them separately.

I'd like to stick to a single proxy protocol, but be able to change transports as needed. For instance, I'd rather use SOCKS5 as the proxying protocol because the proxy protocol specified in SS2022 (and the traditional Shadowsocks) doesn't have a mechanism for the server to report connection errors.

@GF-Huang The plaintext fits in a single AES block, so the raw AES block cipher is used without any block cipher mode.

About AEAD-2022 Detection Prevention, should the client side also need to consider socket-closing strategy? Or just close the socket directly when the first SINGLE read operation can not receive the enough bytes to decrypt the fixed-length header?

3.1.3. Detection Prevention
...

To process the salt and the fixed-length header, servers and clients MUST make exactly one read call. ...
...

In such circumstances, do not immediately close the socket. Closing the socket with unread data causes RST to be sent. This reveals the exact number of bytes consumed by the server. Implementations MAY choose to employ one of the following strategies:
...

should the client side also need to consider socket-closing strategy?

@GF-Huang Good question. I don't think we have any implementations out there that use these strategies on the client side. I guess just closing the socket is good enough, as the read-once requirement was meant as a defense mechanism against bad server implementations, not censors that already interfere with TCP byte streams.

And please don't delete your comment after getting a response from someone else. Doing so would just make the responder look stupid. πŸ˜… Besides, I could totally see someone else having the exact same questions as you did last week, especially coming from a C# background.

@database64128 Thanks your reply.

And please don't delete your comment after getting a response from someone else.

And sorry for that.

By the way, what does SIP023 solves for? I could not see any detail description in SIP023.

By the way, what does SIP023 solves for? I could not see any detail description in SIP023.

SIP023 allows you to serve multiple users with distinct PSKs on a single port.

For legacy Shadowsocks, there are server implementations that rely on brute force trial decryption to support multiple passwords on a single port. SIP023, on the other hand, aims to take advantage of deep integration with SIP022 to be as efficient as possible.

@database64128

SIP023 allows you to serve multiple users with distinct PSKs on a single port.

It sounds like a server-side only spec?

Unlike previous editions, Shadowsocks 2022 requires that a cryptographically-secure fixed-length PSK to be directly provided by the user. Implementations MUST NOT use the old EVP_BytesToKey function or any other method to generate keys from passwords.

This is not good for convenience -- and bad convenience encourages adoption of the worse pre-2022 solutions, making overall security worse.

I get that everyone wants to jump off the SHA1 boat. That's fine, but we could also have used a different password-based KDF. (blake3::derive_key is explicitly NOT FOR password use because it does not have a big work factor.) agev1 uses scrypt; the documentation for blake3::derive_key suggests Argon2.

We can just choose an algorithm agree on a nice and big work factor and that's it. Maybe figure out a way to send the salt, maybe just use with a fixed salt so no new type of data needs to be sent; these type of thing can be slow enough for brute-force anyways.

Was the old thing so bad?

The old thing was... not HKDF_SHA1. EVP_BytesToKey has a very concise implementation in shadowsocks-crypto as openssl_bytes_to_key. It does a new run MD5 to expand the key by 16 bytes, so it runs for at most two times.

So it's really not great: for a 128-bit key, it probably just runs once without salt, not impossible given the now terrible (well, 2^123.4 isn't exactly completely broken, but you don't want to take chances!) preimage resistance. But before Mallory can do that, he needs to get the MD5 hash... which would be our PSK. I don't think that's expected to happen.

And we are talking about something to put on top of the PSK; the PSK can be specified directly in v1 too. I think there is a point in allowing "something that turns into a high-quality PSK of the correct length", and that doing so would not be less secure, that's all.

(The pass-to-hash direction should still be slow to account for the likely reduced search space of passwords compared to a whole random key, if we want to resist an adversary that records whole streams and tries to break them later. Salt exchange would help if we want that.)