Re-consider separate, ordered mailbox
thomaseizinger opened this issue · 10 comments
With #85, we are introducing 3 mailboxes: ordered, prioritized and broadcast.
Ordered exists because BinaryHeaps
don't preserve insertion order and thus, messages with the same priority (like 0 as the default) could be processed in a different order, assuming they are used with split_receiver
and the processing thus happens async.
I am claiming that this is way too much of a detail to expose to the user (which has to eventually learn that there are 3 mailboxes) so my suggestion would be to:
a) Use a data structure that preserves insertion order
b) Accept the re-ordering of messages
I am not data structure expert but couldn't we for example also use an atomic counter to tag each incoming message and sort the heap based on priority and message counter? That should preserve the sending order even among messages with the same priority.
Couldn't we for example also use an atomic counter to tag each incoming message and sort the heap based on priority and message counter?
In theory, yes, but how do you propose to handle overflows? This was the main issue I ran into thinking about this option
Couldn't we for example also use an atomic counter to tag each incoming message and sort the heap based on priority and message counter?
In theory, yes, but how do you propose to handle overflows? This was the main issue I ran into thinking about this option
Reset it every time the queue is empty perhaps?
Am I too naive in saying that we could also just not handle them? The "problem" would be a slight out of order handling of same-priority messages.
From experience, actors receive messages from various addresses concurrently, so the exact order in which they will be handled is unclear anyway so this wouldn't even be noticeable.
But if we accept this behaviour in an edge case, we might also just accept it all the time :D
b) Accept the re-ordering of messages
Expanding more on this:
Usage of split_receiver
makes message delivery asynchronous and I think by definition, timing of events doesn't matter in an async system. (I've found this but it doesn't have many follow-up references.)
In other words, I would argue that someone using split_receiver
explicitly gives up control over the ordering of their messages.
We may want to debate the name of that function again because it doesn't communicate that right away, only conceptually.
If we would implement #92 (comment), then this issue would go away because the BinaryHeap
could be replaced by a VecDeque
per message priority.
@Restioson What is your latest position on this?
I am gonna try and summarize our current knowledge:
Separate handling of messages with default priority has the advantage that they will be delivered to the actor in the order they have been sent even if the user uses split_receiver
.
I see two problems with this feature:
- We are special-casing only the default priority. If I do the same thing with two messages of priority 5, they are not getting the same treatment.
- The potential re-ordering only happens if the user chooses to await the handler invocation asynchronously. I think by definition, that means they don't care about the order in which these messages are handled, otherwise they would have chosen synchronous execution, right?
From a user perspective, the feature is incomplete (only applies to default priority) and from a maintainer perspective, it adds a fair bit of complexity to the codebase which is why I'd like to remove it.
Friendly ping @Restioson :)
I propose to merge ordered
and priority
into a unicast
mailbox. I think this can wait for after 0.6
as it doesn't affect the public API although it would be nice to do before given that the "priority" feature is only introduced in master and doesn't exist in 0.5
.
Hi, sorry for getting to this so late. I am back from holiday now :)
I think the main thing around having a unified mailbox would be that we would need to find some way to implement the two features (priority & no priority) in one fell swoop while still retaining message ordering, as there are some existing usecases which use message ordering. We could try the VecDeque solution and see what the performance/memory usage impact would be versus the current solution. I do anticipate it would use more memory and allocate more, needing one new VecDeque per priority that the user is using. Maybe there are other priority queue structures that also preserve insertion order?
Hi, sorry for getting to this so late. I am back from holiday now :)
I think the main thing around having a unified mailbox would be that we would need to find some way to implement the two features (priority & no priority) in one fell swoop while still retaining message ordering
As I tried to argue above, message ordering is guaranteed if you await the response of the message. If you say split_receiver
, you are in asynchronous land where ordering goes out the window by design.
Can you explain what ordering you want to retain?
Also, I think we shouldn't look at it as no priority vs priority. Priority 0 is just another priority and not no priority :)
Here are meeting "notes" from an out-of-band meeting: https://hackmd.io/@AeK4eNTZTGGJXAT4cxS_Zw/r1MvCrHCi.
The conclusion of the meeting is:
- We agree that the ordered & priority mailboxes should be merged:
- It should be easier to understand for users as there is no explanation needed why initializing with
capacity
may end up consumingcapacity * 3
memory. - It removes the feature disparity where only "default" priority messages retained ordering
- It should be easier to understand for users as there is no explanation needed why initializing with
- We want to have a function like
split_receiver
in the library because:- Back-pressure is important. We cannot just tell users to
tokio::spawn(address.send())
as that breaks the back-pressure chain if we f.e. read messages from a socket. - Spawning requires knowledge about the executor and we want to be an executor-agnostic library. We also want to make it possible for users to further build executor-agnostic libraries on top of
xtra
which would not be possible ifsplit_receiver
-like functionality is not offered out of the box.
- Back-pressure is important. We cannot just tell users to
- Merging the mailboxes removes the "ordering" guarantee in combination with
split_receiver
. We agreed that we need a better name for the function to convey this:- Ideally, the name is goal-oriented. What happens is "background-processing" or "asynchronous handling" but "async" is a very overloaded term within the Rust space so we would like to avoid that, esp. because
xtra
defines itself as being an "async-first" library but this is aimed at the support forasync fn
handlers. - The name should imply or at least suggest that the actor may handle several message sent using that function in any order.
- Ideally, the name is goal-oriented. What happens is "background-processing" or "asynchronous handling" but "async" is a very overloaded term within the Rust space so we would like to avoid that, esp. because
Some naming suggestions that came up during the meeting:
send().queue()
: Suggests that we get a space in the mailbox but that processing is separate. It also has a co-notation of ordering though because most queue's are FIFO which pretty much rules this one out. It is also weird thatqueue
is a "configuration" ofsend
and not an alternative. We do however want to retain the simplicity ofAddress
only offeringAddress:send
andAddress::queue
.send().fire_and_forget()
: Suggests that we will not wait for the response which is in line with what we want. It is kind of odd that you can still get the response by waiting for the returnedreceiver
.send().eventually()
: Suggests that processing will happen "eventually" which has not notion of ordering. There is no prior art that I know about so this one may be very hard to discover.
An idea that just came to my mind:
addr.send().spawn()
:
spawn
has prior art in the ecosystem of doing something "in the background".- Both threads and tasks can await the result through a handle returned from
spawn
, same as the receiver returned fromsend()
.
What may not be intuitive is that people might think "what executor / thread is this going to be spawned on?" when in reality it isn't actually "spawned" as such but just queued, albeit in a non-FIFO queue.