snapview/tungstenite-rs

Allow sending of shared data

Closed this issue ยท 35 comments

I think that it would make sense that Message be instead:

pub enum Message {
    Text(Arc<String>),
    Binary(Arc<Vec<u8>>), // maybe taking `Bytes` from the `bytes` crate would be cleaner.
    Ping(Vec<u8>),
    Pong(Vec<u8>),
    Close(Option<CloseFrame<'static>>),
}

This is because the websocket transport cannot guarantee messages to not be lost (from what I understand). Thus someone who wants to prevent messages from being lost would keep a list of messages and only consider them "sent" when they are ack'd by the peer. Using a Arc would greatly reduce the cost of doing so since clone operations would be more efficient.

Hmm,

given your other issue, I understand your concern, but I think your usecase is rather rare. Thus putting everybodies message data always in an Arc will incur an overhead all the time for everybody that doesn't need this.

I would suggest that a connection loss is an exceptional and erroneous situation, in which case it's generally acceptable to have a performance cost. What I'm suggesting is that you regenerate the messages that haven't been acknowledged in this situation. This can happen concurrently to establishing a new connection.

That being said, maybe using a data type like Bytes instead of Vec<u8> would open up options for copyless scenarios, including this one? Not sure about the performance trade-offs with Bytes.

Thus putting everybodies message data always in an Arc will incur an overhead all the time for everybody that doesn't need this.

I would suggest that a connection loss is an exceptional and erroneous situation, in which case it's generally acceptable to have a performance cost. What I'm suggesting is that you regenerate the messages that haven't been acknowledged in this situation.

In my experience, a TCP crash is only a matter of time, especially for long running connections (16bit checksum, etc.). Thus if you want reliable messaging you need to keep a copy of each message locally until it is ack'd. Currently, you need to clone a Vec<u8> each time to do so, which is a massive additional overhead just to have reliability. So while it is true that the crash itself is not part of the hot path, the mechanism to ensure reliability very much is.

That being said, maybe using a data type like Bytes instead of Vec would open up options for copyless scenarios, including this one? Not sure about the performance trade-offs with Bytes.

I know that this is the approach that tokio takes(look at Bytes and BytesMut). The Bytes is an Arc<u8> plus some other convenience stuff. It would be interesting to benchmark to determine if this has a significant performance impact. My hypothesis is that the cost should be minimal compared to the I/O overhead.

Currently, you need to clone a Vec<u8> each time to do so, which is a massive additional overhead just to have reliability.

Sure, if you can't regenerate them just in case of failure, that's not nice for memory consumption.

What I hope by suggesting using Bytes is that we might get ecosystem wide possibility of copyless end-to-end. From where the data enters the process to where it leaves. There is an interesting new proposal for the AsyncRead/AsyncWrite traits in a PR on tokio that could really make that happen.

There is an interesting new proposal for the AsyncRead/AsyncWrite traits in a PR on tokio that could really make that happen.

That's cool, I'll take a look.

@najamelan What are your thoughts on having a asymettric Message type? Something like:

pub enum SendMsg {
    Text(String),
    ShText(Arc<String>),
    Binary(Vec<u8>),
    ShBinary(Bytes),
    Ping(Vec<u8>),
    Pong(Vec<u8>),
    Close(Option<CloseFrame<'static>>),
}

pub enum RecvMsg {
    Text(String),
    Binary(Vec<u8>),
    Ping(Vec<u8>),
    Pong(Vec<u8>),
    Close(Option<CloseFrame<'static>>),
}

This would enable zero-copy scenarios while not forcing the user to use Bytes.

An alternative would be to provide a separate WebSocket::write_shared_message API. This would preserve backward compat.

enum SharedMessage {
    Binary(Bytes),
}

Can't we instead change Message to:

pub enum Message<'a> {
    Text(&'a str),
    Binary(&'a [u8]),
    Ping(&'a [u8]),
    Pong(&'a [u8]),
    Close(Option<CloseFrame<'static>>),
}

This would allow to cache messages if needed without cloning data and will allow users to remove unnecessary allocations, e.g. if messages are fixed or with if reusable buffer is used.

The same type can be used for receiving messages as well, with the following signature changes:

// this function will use an inner buffer stored inside `WebSocket` instance
pub fn read_message<'a>(&'a mut self) -> Result<Message<'a>> { .. }
// we also could add function like this
pub fn read_message_to<'a>(&mut self, buf: &'a Vec<u8>) -> Result<Message<'a>> { .. }

This would allow additional optimizations, although it may complicate code a bit (especially for async).

Not for receiving. The incoming message may be split across multiple packets. It may work with Cow but would still require copying in many cases.

But it would work with read_message_to, correct? I think introducing an additional "reference" message type (in addition to the current "owned" one) and additional read/write methods for it is a better option compared to introducing additional message variants.

In the current implementation it would be not more efficient than just copying the vector afterwards. We keep the half-received message in a Vec and return it by move as it gets complete.

I also proposed some idea in a comment to #104, perhaps it would also be a solution.

I'll give #104 another try in a couple of weeks. I'm thinking about writing a decent benchmark to view the overhead of using Bytes over a Vec<u8>.

  • If the overhead is significant I'm thinking of using @application-developer-DA's approach and

    [modify the] Message structure, so that the Message::Binary accepts some Payload type (and we can provideFrom implementation to support Bytes and Vec) [...]

  • If the overhead is minimal then maybe replacing Vec<u8> directly by Bytes makes sense.

Note that both cases would lead to a breaking change.

x1957 commented

Is there any updates?

I think replace Vec<u8> by Bytes is a better choice, even if there is a breaking change.

@x1957 , awaiting a feedback from @agalakhov on #104 (review)

Hello, i am new to this and very interested. Maybe it would be possible to add a Shared variant to the enum which just stores an Arc<Message> this would have the downside that a shared message could be "reshared" but it wouldn't be too hard to implement and then the Arc could handled internally. This also wouldn't incur overhead for everyone but only the users of the shared variant. This crates API is very handy, in my use case i want to create broadcast messages using websockets and cloning all the data is noticeable overhead.

There is #104 which is still open.

Perhaps the best and simplest solution would be the one that has already been suggested by contributors in this issue, e.g. usage of Bytes instead of Vec<u8>.

Hello. Broadcasting data to multiple connections like (10k connections) would be very inefficient since we need to copy message 10k times. I vote for solution with Bytes

I also like using Bytes instead.

I looked at the source code, and Bytes::from(Vec<u8>) is indeed copy-free. There is no allocation at all if capacity == len, and if not, there is an allocation for control block.

adryzz commented

Hi! i'd like to try working on this issue if possible, but i'd need a few clarifications to implement this properly:

  • Is this a breaking change?
    e.g. should the use of Bytes be a replacement or another option for the Binary message type?

  • Is only the Binary message type affected or should something be done with strings too (e.g. with Arc<str> or similiar)
    I'm asking this because currently on the String we call .into_bytes(), which consumes the string, and thus wouldn't work on an Arc<str> or Arc<String>

  • What about receiving a message? should that use Bytes too? (and if not, should the message type be asymmetric?)

thanks

Hello, would it be possible to decide on this? as it's opened for years despite it's not complicated and several users propose to do it.
In addition to the broadcasting use case, it also allows to split a big binary message into several smaller message frames without extra allocations, as subviews of Bytes dont allocate anything.
Two solutions:

  • limited breaking change, but extra allocation on client side when applying mask (as Frame doesnt own the data): change Text type to Arc<String> and Binary to Bytes. Users who used the public functions of Message text(String) and binary(bin) wont be impacted.
  • existing users will see no impacts : add a SharedText(Arc<String>) and a SharedBinary(Bytes) to Message that would need to be filled manually by the user who wants this to send a message. Those who currently use the Binary and Text wont be impacted, and these fields wont be used for messages received.

Common thing to do: out_buffer of FrameCodec should be a vector of enum(string,vec<u8>,bytes,arc<String>) (or only enum(arc<String>, Bytes) with first solution) with the total size computed to compare with out_buffer_write_len, and write_out_buffer would just loop on this.

Solution 2 seems better, as first one can extra allocate and it's a breaking change in the public enum.

by the way, i did have an incomplete branch adding the breaking change, i'll see if i can resurrect it, completely forgot about this

@francois-random, I see your point. I also agree that it's a bit embarrassing that we did not clearly communicate a decision, sorry for that.

Regarding your suggestions, let's put it like this: as long as there are no objections or strong opinions from those who made significant contributions rel. to the performance and/or even created crates that depend on tungstenite/tokio-tungstenite (@alexheretic and @sdroege come to mind), I would be okay with any of the proposed strategies (I'm also okay with a small breaking change) as long as the implementation and the API surface remain clean/easy to understand and the performance penalty is negligible.

I'd have to see the exact implementation to say for sure, but it generally sounds good to me and for me the Vec<u8> for Binary was also a limiting factor in the past. Bytes seem like a clear better alternative.

For the string case, Arc<String> (why not Arc<str> though?) have some trade-offs. No strong opinion there.

We could try to resolve this using traits and making Message generic over underlying types. Something like

pub enum Message<S: AsRef<str>> {
    Text<S>,
    ...
}

type RecvMsg = Message<String>;

The original idea behind the interface is to make memory allocations explicit: "If there has to be clone() somewhere, let the user write this clone() explicitly and don't hide it within the library to avoid false sense of "zero-copy" approach".

AsRef<str> sounds like the best option, because it allows for any type that is Deref<str>, like String, &str Arc<str>, Rc<str> and more

making Message generic over underlying types

Worth a try but this often makes the API harder to use. You'd need two types (one for String, one for Binary) and it would probably make sense to default them to String / Vec<u8> to make porting of existing code easier.

Another potential problem with generics that comes to mind is that essentially it goes in the direction of having separate types of Message for reading and writing; this means that at some point, when porting the code of tokio-tungstenite, we either lose the ability to use split() and things alike, or we are forced to use the same type parameters for the message (meaning that we use a single type like we currently do but in a less ergonomic way).

Working with shared data effectively means using different types for reading and writing. Using Arc everywhere "just in case" means messing with atomics and CPU cache even if we don't need to, and is generally bad in single-producer, single-consumer applications where we definitely want to have a Box or even more efficient static buffer data. Some applications want a Rc and not an Arc. So having a generic over Box/Rc/Arc seems to be what is actually desired. The question is, how do we make it ergonomic?

We know how to make it. An enum with both OwnedData (String for text and Vec for binary, same as now) and with SharedData, like solution 2- I mentionned earlier, OwnedData will be used in all case by the lib to fill the message when it's received, and it's up to the user to decide if he wants to send an OwnedData or a SharedData.
Not thate any generic for this (if it's decided) can concern only SharedData, so split() with tokio-tungstenite would still work.
The question is what are the cons of that ? Is it really less readable or more complicated for the user or..?
In the case of a generic used for SharedData, are there lifetime issues?

This is similar to Cow, isn't it? I personally prefer enum within enum:

pub enum Text {
    Owned(String),
    Shared(Arc<str>),
}

pub enum Message {
    Text(Text),
    ...
}

This is more idiomatic and could be better when (and if) we eventually port Tungstenite to no_std.

No strong preference for me but perhaps your solution with enum of enum is more understandable from the library user's point of view, as it keeps one enum type per kind of websocket message. And inside the type, you choose Owned or Shared.
The easier would be to postpone the use of Asref as a shared enum type and to stick on Bytes & Arc for the moment.

qRoC commented

messing with atomics and CPU cache even if we don't need to

For construct Arc no need atomics, so

- Text(String),
+ Text(Arc<str>),

is free for users who enough String functionality:

-Message::Text(s)
+Message::Text(Arc::from(s)) // free

For users who need Rc (users with single thread runtime) best is add feature like tokio rt-multi-thread:

- Text(String),
+ #[cfg(feature = "rt-multi-thread")]
+ Text(Arc<str>),
+ #[cfg(not(feature = "rt-multi-thread"))]
+ Text(Rc<str>),

pub enum Text {
    Owned(String),
    Shared(Arc<str>),
}

it's worse than just casting String -> Arc

@qRoC there is extra allocation on client's side if you force all text messages to be 'Arc' because you will need to apply a mask and it requires the message to be mutable (it will be the same if you also force all binary message to be Bytes).
This is also a limited breaking change from the library user's point of view.
You're talking about solution 1 that I mentioned one week ago and @agalakhov talks about a slight modified solution 2.
To me, 2) is still better.

By the way, Bytes type has now https://docs.rs/bytes/latest/bytes/struct.Bytes.html#method.from_owner ,
to allow construct it from almost any types that own &[u8]. So

pub enum Message {
    Binary(Bytes),
    Ping(Bytes),
    Pong(Bytes),
}

would be good fit send any data in form of [u8] via Message without any extract allocation and copy.

Implemented in #462 / #465