quicwg/datagram

Specify Max Payload Size instead of Max Frame Size

Closed this issue · 26 comments

Right now, the transport parameter (TP) specifies a maximum frame size, including frame type, length and payload. This makes certain values invalid (0, 1?). Also, since this value is practically a kind of flow control, indicating how much data I'm willing to receive at a time, it's the payload length that's important here, not the framing.

For these reasons, I'm arguing to change this to specifying a maximum payload length. That leaves the question of what a value of zero means: should a value of 0 be the same as not present, or should it mean that only 0-length datagrams are allowed? I think it is simpler to say that a value of zero is the same as not present (i.e. disabled).
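As a rough illustration of why a frame-size limit makes small values awkward, the sketch below converts a maximum frame size into the largest payload that still fits. It assumes a one-byte frame type and an optional length field encoded as a QUIC variable-length integer (1, 2, 4 or 8 bytes); the function name is made up for this example and is not taken from the draft.

```python
from typing import Optional

# (bytes used by the length field, largest value encodable in that many bytes)
_VARINT_SIZES = (
    (1, (1 << 6) - 1),
    (2, (1 << 14) - 1),
    (4, (1 << 30) - 1),
    (8, (1 << 62) - 1),
)


def max_payload_for_frame_limit(max_frame_size: int, with_length: bool) -> Optional[int]:
    """Largest payload that fits in a frame of at most max_frame_size bytes.

    Returns None when not even an empty DATAGRAM frame fits.
    """
    type_len = 1  # assumption: the frame type occupies a single byte
    if not with_length:
        room = max_frame_size - type_len
        return room if room >= 0 else None

    best = None
    for length_bytes, largest in _VARINT_SIZES:
        room = max_frame_size - type_len - length_bytes
        if room < 0:
            continue
        # The payload is bounded by the space left and by what the
        # length field of this size can express.
        candidate = min(room, largest)
        if best is None or candidate > best:
            best = candidate
    return best
```

Under these assumptions a limit of 0 admits no frame at all, and a limit of 1 admits only an empty frame without a length field, which matches the "(0, 1?)" concern above.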

(Issue copied from individual draft repo, by @nibanks on 2019-11-18)

Comment from @Ralith on 2019-11-18:

Wasn't the legality of zero-length frames agreed upon in tfpauly/draft-pauly-quic-datagram#19?

I didn't even realize that the parameter wasn't payload size. Strongly agree that it should be.

Comment from @mikkelfj on 2019-11-19:

I agree that it is simpler to define zero to mean not allowed, but it is highly confusing to be forced to require a length of one if you specifically only use 0-length datagrams for heartbeats or similar. It would be better to have a separate indicator to reject datagrams altogether.

It would be better to have a separate indicator to reject datagrams altogether.

Isn't this indicated by omitting the transport parameter entirely?

@Ralith If omitting the parameter indicates that datagram frames are not permitted, then this achieves the purpose, yes. So there is no need to have 0 mean disabled. Hence it is better to have 0 mean a length of 0 that is still valid. This is the most natural representation, and there is an actual meaningful use of these, namely heartbeats. Further, by making it explicit that 0 is the only valid length, you can have a fast or lightweight implementation.
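The interpretation argued for in this comment can be sketched as a receiver-side check. This is only an illustration of the proposed semantics (absent parameter means datagrams are disabled, 0 means only empty datagrams, N means payloads up to N bytes); the function and parameter names are hypothetical.

```python
from typing import Optional


def datagram_allowed(max_payload_size: Optional[int], payload: bytes) -> bool:
    """Return True if a datagram payload is acceptable under the advertised limit.

    max_payload_size is None when the peer omitted the transport parameter,
    which under this interpretation disables datagrams entirely.
    """
    if max_payload_size is None:
        return False
    # A limit of 0 still admits empty datagrams, e.g. heartbeats.
    return len(payload) <= max_payload_size
```

With this reading, `datagram_allowed(0, b"")` is accepted while `datagram_allowed(None, b"")` is rejected, so no special-casing of zero is needed.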

I wonder if we need this limit at all. It introduces implementation complexity and it's unclear why any implementation would like to limit the size of DATAGRAM frames (or payloads) they can receive. We already have max_packet_size for implementations that have a limit on their stack size.

Does anyone have a use-case where they'd like to limit the size of DATAGRAM frames (or payloads) specifically?

I'm all for removing the limit. As I have it coded up in MsQuic, I always send a max value. It's not even configurable by the app. I don't think "I have a buffering limit" is a good reason to have a limit. It's limited by a single packet already. That should be enough IMO.

I don't think it's particularly useful to set this smaller than a packet, given that a reasonable buffer is bigger than that, and I'm not sure what a peer could usefully do with the information that the limit is larger than a packet. 👍

The more I think about it, limiting the size of QUIC DATAGRAMs (frames or payloads) is weird. In part because there is no wire signal to describe a max number of DATAGRAMs per packet, and we seem to have discounted the need for an internal API to configure how frames are composed in a packet. I don't hear anyone with use cases; removing the limit removes a whole load of edge cases and/or implementation-defined behaviour. I think the simplification would be a win.

@DavidSchinazi One reason you might want to limit datagrams is if you have a slotted buffering system for datagrams, for example 1K or 256 bytes per datagram, or a paged system of multiple datagrams. In this case you cannot always fit a whole packet into the buffering. For streams you have an entirely different mechanism, and other frames are generally not large.

@mikkelfj that's an interesting thought - is this something you're planning on implementing and have a use for, or is it more of a thought experiment?

@DavidSchinazi Unfortunately I've had to push back on my QUIC dev, but I did work with an emulation of netmap, on top of select / poll, that uses slots to receive packets from the OS without context switching. Sometimes a packet needs to span multiple slots, but often it just drops into a 2K slot or so. I could imagine many use cases where datagrams are redistributed to other threads or processes in a similar manner without having an explicit use case. I believe there are now several kernel interfaces that use a similar approach.

As far as I know, those kernel interfaces all use slots that are larger than the interface MTU, so I don't think they'd require reducing the payload size. Am I missing something?

Well, I can only speak for netmap, but it works by having the user specify one large memory block divided into a number of equal-sized slots of a user-specified size. Each slot has a control record with some flags. A record can indicate that the slot is partial and reference the next slot in the packet. So you can choose a slot of 64K and waste a lot of memory, or choose a slot of 512 bytes and frequently have to deal with fragmenting. Or choose 2K for a use case where you know that you will not have a larger payload and can afford the space overhead.

EDIT: to clarify, it is the network driver that chooses to link data and update the control record when receiving, and the user process when writing.

I'm not sure exactly how the other kernel interfaces behave, but from memory it is much the same.

For QUIC datagrams it is a bit different in that you likely want to avoid fragmentation at all cost due to the added complexity and the non-stream nature of the data. If you could afford fragmentation it would make less sense to have a limit on the size.
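The slot-size trade-off described in this comment can be put into numbers with a small sketch. It models only the arithmetic (how many fixed-size slots a packet occupies and how much slot space goes unused); it does not use or imitate the real netmap API, and the function name is invented for the example.

```python
from typing import Tuple


def slot_usage(packet_len: int, slot_size: int) -> Tuple[int, int]:
    """Return (slots occupied, unused bytes) for one packet in fixed-size slots."""
    if slot_size <= 0 or packet_len < 0:
        raise ValueError("slot_size must be positive and packet_len non-negative")
    # Ceiling division; even an empty packet occupies one slot.
    slots = max(1, -(-packet_len // slot_size))
    unused = slots * slot_size - packet_len
    return slots, unused


for size in (512, 2048, 65536):
    slots, unused = slot_usage(1500, size)
    print(f"slot={size:>5}: {slots} slot(s), {unused} bytes unused")
```

For a 1500-byte packet, 512-byte slots mean the packet spans three linked slots, 2K slots hold it in one with modest overhead, and 64K slots hold it in one while leaving almost the whole slot unused.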

I support switching to TP specifying max payload size.

I wonder if we need this limit at all. It introduces implementation complexity and it's unclear why any implementation would like to limit the size of DATAGRAM frames (or payloads) they can receive. We already have max_packet_size for implementations that have a limit on their stack size.

IIRC, the original reason this limit was introduced is to allow a QUIC proxy to communicate the max datagram size it gets from the MTU between itself and the backend.

In my experience, a MASQUE proxy doesn't know that information at the time it opens a listening QUIC connection, and there is nothing to say that all backends would share the same value.

This is not for MASQUE, but for a direct QUIC-to-QUIC proxy.

OK, do you expect the max DATAGRAM frame size a backend expects to receive to be smaller than the QUIC packet size it is willing to receive?

If we do still have a max frame/payload size defined, perhaps it makes sense to let this be updated by a new frame (MAX_DATAGRAM_SIZE).

A MAX_DATAGRAM_SIZE frame would be tricky because of reordering. In QUIC, all the MAX_FOOBAR frames can only increase limits, but here I suspect for this to be useful we'd want to be able to lower the limit which greatly increases complexity.
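The reordering concern can be made concrete with a small sketch. It compares two hypothetical receiver policies for a MAX_DATAGRAM_SIZE-style frame (a frame proposed in this thread, not one that exists in QUIC): an increase-only policy like the existing MAX_* frames, and a "last arrival wins" policy that would be needed if the limit could also be lowered. The values are arbitrary.

```python
from itertools import permutations
from typing import Iterable


def final_increase_only(initial: int, arrivals: Iterable[int]) -> int:
    """Limits may only grow, so a stale frame arriving late is ignored."""
    limit = initial
    for advertised in arrivals:
        limit = max(limit, advertised)
    return limit


def final_last_arrival_wins(initial: int, arrivals: Iterable[int]) -> int:
    """Limits may shrink, so whichever frame arrives last decides the value."""
    limit = initial
    for advertised in arrivals:
        limit = advertised
    return limit


sent_in_order = [1200, 1000, 1400]  # illustrative limits, sent in this order

# Try every possible arrival order of the three frames.
increase_only_results = {final_increase_only(500, p) for p in permutations(sent_in_order)}
last_wins_results = {final_last_arrival_wins(500, p) for p in permutations(sent_in_order)}

print(sorted(increase_only_results))  # a single value regardless of order
print(sorted(last_wins_results))      # depends on which frame arrived last
```

The increase-only policy converges to the same limit for every arrival order, while the policy that allows decreases ends on whatever value happened to arrive last. A design that supports lowering the limit would therefore need some extra ordering mechanism, such as a sequence number, which is the added complexity mentioned above.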

OK, do you expect the max DATAGRAM frame size a backend expects to receive to be smaller than the QUIC packet size it is willing to receive?

I expect this to be fairly uncommon, but whenever this does happen, it would make the entire connection unreliable for datagrams, meaning the applications would have to do their own MTU discovery on top of the one QUIC already does.

The default ought to be "whatever fits in the PMTU".

Yes, you have to do PMTUD. (What is pernicious here is that if you get too good at that, the value that you resolve to might not be good forever because your varints will use more bytes over time and so reduce available space.)

@martinthomson which varints will use more bytes here?

I was thinking about flow IDs. If you are not reusing flow IDs, there is a good chance that those will get bigger over time. But I guess that assumes use of flow IDs.

Mostly, I guess that DATAGRAM frames can each be sent in their own packet/datagram, so the size of other frames won't affect the space that is available.

Yeah the flow ID is part of the payload as far as this document is concerned.
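To put numbers on the varint point, here is a minimal sketch that assumes the application prefixes each datagram payload with a flow ID encoded as a QUIC variable-length integer. That prefix layout and the function names are assumptions for illustration; as noted above, the flow ID is just payload from this document's point of view.

```python
def varint_len(value: int) -> int:
    """Bytes needed to encode value as a QUIC variable-length integer."""
    if not 0 <= value < (1 << 62):
        raise ValueError("value out of range for a QUIC varint")
    if value < (1 << 6):
        return 1
    if value < (1 << 14):
        return 2
    if value < (1 << 30):
        return 4
    return 8


def app_bytes_available(max_payload: int, flow_id: int) -> int:
    """Payload bytes left for application data after the flow ID prefix."""
    return max(0, max_payload - varint_len(flow_id))


for flow_id in (10, 64, 16384, 1 << 30):
    print(flow_id, app_bytes_available(1200, flow_id))
```

With a 1200-byte payload budget, the space left for application data shrinks from 1199 to 1192 bytes as flow IDs grow past each varint boundary, so a size that was probed once with a small flow ID may no longer fit later.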

Our implementation will try to coalesce DATAGRAM frames with other frames, but that doesn't impact the max possible payload size because if the DATAGRAM frame doesn't fit with other frames we just send it in its own packet.
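The coalescing behaviour described here could look roughly like the following. This is a sketch under assumptions, not the code of any actual implementation; the function name, its parameters, and the returned labels are hypothetical.

```python
def place_datagram_frame(frame_len: int, space_left: int, empty_packet_space: int) -> str:
    """Decide where a DATAGRAM frame goes.

    frame_len:          encoded size of the DATAGRAM frame in bytes
    space_left:         bytes still free in the packet currently being built
    empty_packet_space: bytes available for frames in a fresh, empty packet
    """
    if frame_len <= space_left:
        return "coalesce"      # fits next to the frames already queued
    if frame_len <= empty_packet_space:
        return "own_packet"    # flush the current packet, send the frame alone
    return "too_large"         # can never be sent at the current packet size
```

Because the fallback is always a fresh packet, the largest sendable frame depends only on `empty_packet_space` and not on what other frames happen to be queued.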