Out of Order Delivery

If all you want is to understand the issue, you can skip the Background (it's an interesting concurrency journey though!) and start from the description of the problem.

Background

It all started with #575 - a simple task like "print each item of a stream and then terminate" turned out to be not that simple: sometimes the program terminated too soon.

Our first thought was that we needed higher-order components that could do proper iteration and terminate predictably by using locks internally. Long story short - it turned out to be a rabbit hole.

We added shiny new components like For and Map and even started to dream about Filter and company. Simple examples worked fine and everything seemed great, until @dorian3343 showed a couple of slightly more complex examples where the problem appeared again. Honestly, at that point I was frustrated. This is what #635 was about.

After several pair-programming sessions with @Catya3 we found out that the problem is not about For or Map or any other component. It's deeper. It's about the runtime.

Fan-In Problem

This is how #644 started. TL;DR: the runtime operated in terms of outgoing connections, i.e. at runtime a connection was sender -> receiver[]. In other words, the network was a fan_out map, and for each such connection there was a goroutine.

In Neva the network graph is a many-to-many relation: not only can one sender have many receivers, but one receiver can also have many senders. This means that not just fan-out (split) but also fan-in (join) is possible. And that was the big problem with this design: it means we can have 2 concurrent goroutines sharing the same receiver.

The simplest case is 2 one-to-one connections like this:

s1 -> r1
s2 -> r1

s1 might send to r1 before s2, but (if sending happens fast enough) there can be a race, because the goroutines are concurrent. The Go scheduler might activate the s2->r1 goroutine before s1->r1, and r1 will receive the messages out of order. By "out of order" I mean an order different from the order of sending.
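To make this concrete, here's a tiny Go model of that design (not the actual runtime code) - one goroutine per connection, both sharing r1's inport channel:

package main

import (
	"fmt"
	"sync"
)

func main() {
	r1 := make(chan int, 2) // r1's inport, shared by both connections
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); r1 <- 1 }() // goroutine for s1 -> r1
	go func() { defer wg.Done(); r1 <- 2 }() // goroutine for s2 -> r1
	wg.Wait()
	fmt.Println(<-r1, <-r1) // sometimes "1 2", sometimes "2 1" - the scheduler decides
}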

First Attempt To Solve (Single Queue)

Goroutines were the source of the race condition. We just need to get rid of them and that will fix out-of-order delivery, right? Well, yeah, but...

After some thinking a decision was made - let all outports send to a single queue. If s1 sends to the queue, it blocks s2; s2 waits for s1 to finish sending. Then a single goroutine reads messages from the queue and delivers them to the corresponding receivers. This is a relatively simple design, but it has 2 problems.
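Here's a minimal Go sketch of the idea (the names are made up, this is not the actual runtime code):

package main

import "fmt"

func main() {
	queue := make(chan int) // one unbuffered queue shared by all outports
	r1 := make(chan int, 2) // r1's inport
	done := make(chan struct{})

	// the single delivery goroutine: queue order == delivery order
	go func() {
		for msg := range queue {
			r1 <- msg // receiver lookup omitted; r1 is the only receiver here
		}
		close(done)
	}()

	queue <- 1 // s1's send; no other sender can enqueue until it completes
	queue <- 2 // s2's send
	close(queue)
	<-done
	fmt.Println(<-r1, <-r1) // always "1 2": the queue serializes senders
}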

Problem 1. Intermediate Connections

A Neva program usually includes not just "atomic" components (runtime functions) but also "complementary" ones - components implemented in Nevalang itself. At runtime they are just ports and connections; no runtime function is spawned to represent them. They "do nothing" except pass messages along.

Let's say we have this program:

flow Main(start) (stop) {
  nodes {Printer}
  :start -> printer -> :stop
}

flow Printer(data) (sig) {
  nodes {Println}
  :data -> println -> :sig
}

In this example Printer is a complementary component and Println is an atomic one. The question is: if we don't have a goroutine per connection, who receives from printer:data? Let's say the runtime itself writes the message to the queue; we then do a lookup, see that printer:data is the receiver for :start, and write the message there. And... nothing happens. This is an "intermediate connection" - a connection that does not lead to a runtime-function receiver - and this design can't handle it.

Problem 2. Deadlock

Let's pretend there's no buffering, for simplicity. The simplest case: s1 -> r1 -> :stop

  • s1 sends to the queue; the queue goroutine receives it and tries to send to r1
  • r1 receives and starts its work
  • s1 writes the next message to the queue and blocks
  • the queue goroutine tries to write the next message to r1 and blocks until r1 is ready
  • r1 finishes its work and tries to send to the queue, but blocks, because the queue goroutine is itself blocked trying to send to r1 - deadlock
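The same trace as a runnable Go sketch (again with made-up names; the sleep just stands in for r1's work). Running it ends with Go's "fatal error: all goroutines are asleep - deadlock!":

package main

import "time"

func main() {
	queue := make(chan int) // single unbuffered queue for every outport
	r1In := make(chan int)  // r1's inport
	stop := make(chan int)  // Main's :stop inport

	// delivery goroutine: routes queue messages to receivers
	// (routing is faked by value ranges for brevity)
	go func() {
		for msg := range queue {
			if msg < 100 {
				r1In <- msg
			} else {
				stop <- msg
			}
		}
	}()

	// r1: receives, works, then sends its result back through the same queue
	go func() {
		for msg := range r1In {
			time.Sleep(10 * time.Millisecond) // r1 "starts its work"
			queue <- msg + 100                // blocks: delivery is stuck on r1In
		}
	}()

	queue <- 1 // delivered to r1, which starts working
	queue <- 2 // the delivery goroutine now blocks on r1In <- 2
	<-stop     // never happens - deadlock
}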

Then we thought: maybe we can avoid having a single queue for all outports and only enforce it for senders involved in a fan-in pattern? Nope, that won't work. It turns out we can still get a deadlock if a loop is involved. (I won't dig into the details here; I believe you can find them somewhere in the issues.)

Long story short - every outport must have its own queue (channel). This is the only way (that I see) to avoid deadlocks.

Second Attempt To Solve (Serializable Messages in Fan-In-Centric Algorithm)

The first problem was solved by implementing a graph-reduction algorithm: by removing all intermediate connections from a program we get a "reduced" version (smaller in size but functionally equivalent) where runtime functions pass messages without intermediate hops. This had been an idea for performance optimization for a long time - who could have guessed we would need it just for correctness? And it worked! Of course, it didn't solve the deadlock problem though.
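To illustrate the idea, here's a toy version of the reduction (the real algorithm lives in the compiler and is more involved):

package main

import "fmt"

// reduce replaces every path through "intermediate" ports with a direct
// sender -> receiver connection between runtime functions
func reduce(net map[string][]string, intermediate map[string]bool) map[string][]string {
	var resolve func(port string) []string
	resolve = func(port string) []string {
		if !intermediate[port] {
			return []string{port} // a real runtime-function inport
		}
		var final []string
		for _, next := range net[port] {
			final = append(final, resolve(next)...)
		}
		return final
	}

	reduced := map[string][]string{}
	for sender, receivers := range net {
		if intermediate[sender] {
			continue // intermediate hops disappear from the reduced graph
		}
		for _, r := range receivers {
			reduced[sender] = append(reduced[sender], resolve(r)...)
		}
	}
	return reduced
}

func main() {
	// :start -> printer:data -> println:data (printer is complementary)
	net := map[string][]string{
		"main:start":   {"printer:data"},
		"printer:data": {"println:data"},
	}
	intermediate := map[string]bool{"printer:data": true}
	fmt.Println(reduce(net, intermediate))
	// map[main:start:[println:data]] - :start now feeds println directly
}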

For the second problem there was no solution except to rewrite the runtime again... This time with these 2 requirements:

  1. no shared queues - each outport is a separate Go channel (just like before the first rewrite)
  2. fan-in-centric algorithm - use an incoming-connections map instead of an outgoing one, and somehow serialize messages before sending them to the receiver

The idea was this: each time we send a message, we increment an atomic counter and assign its current value to the sent message. This way each message gets its own unique, chronologically comparable identifier. Then, whenever more than one message is available for delivery, we sort them by this ID and deliver them to the receiver in the same order they were sent.
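Here's a simplified Go sketch of the counter idea (not the code from #670, just the concept):

package main

import (
	"fmt"
	"sort"
	"sync/atomic"
)

var counter atomic.Int64 // one global counter for the whole runtime

type msg struct {
	id   int64 // stamped at send time: lower id == sent earlier
	data any
}

func send(data any) msg {
	return msg{id: counter.Add(1), data: data}
}

func main() {
	// imagine these arrived from concurrent senders in arbitrary order
	pending := []msg{send("a"), send("b"), send("c")}
	pending[0], pending[2] = pending[2], pending[0] // scramble arrival order

	// sort by id before delivery: the receiver sees sending order
	sort.Slice(pending, func(i, j int) bool { return pending[i].id < pending[j].id })
	for _, m := range pending {
		fmt.Println(m.id, m.data) // 1 a, 2 b, 3 c
	}
}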

Before the atomic counter there was an idea to use UNIX timestamps (nanoseconds). It turns out a computer can work so fast that several messages may be sent within a single nanosecond, which means we can't tell which one was sent first (even though there was a clear order).
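You can check this yourself with a quick experiment - on a machine with coarse enough clock resolution, consecutive readings collide:

package main

import (
	"fmt"
	"time"
)

func main() {
	dups := 0
	prev := time.Now().UnixNano()
	for i := 0; i < 1_000_000; i++ {
		now := time.Now().UnixNano()
		if now == prev {
			dups++ // two "sends" indistinguishable by timestamp
		}
		prev = now
	}
	fmt.Println("duplicate timestamps:", dups) // > 0 on coarse clocks
}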

A design that meets these requirements was implemented and shipped in #670.

The Issue Itself (Out of Order Delivery)

After the merge of #670 the fan-in issue is finally fixed. However, there is still an issue - ironically, one caused by increased performance. The issue we're talking about can only occur when things happen fast enough, and the old version of the runtime was slow enough that we couldn't see it.

To understand the issue, let's look at this little program. It's a slightly desugared version of the e2e test "for with range and if" - I replaced deferred connections with explicit locks, which makes it easier to see what is actually happening:

import { lists }

const lst list<bool> = [true, false]

flow Main(start) (stop) {
	nodes { Iter, For{Printer} }
	:start -> ($lst -> iter -> for -> :stop)
}

flow Printer(data bool) (sig any) {
	nodes { If, Println, l1 Lock, l2 Lock }

	:data -> if

	1 -> l1:data
	0 -> l2:data

	if:then -> l1:sig
	if:else -> l2:sig

	l1 -> println
	l2 -> println
	
	println -> :sig
}

It's clear that we send true and then false. Does that mean we must see 1 and then 0? It turns out it doesn't!

Note that we have fan-in here:

l1 -> println
l2 -> println

But that's not the problem. With the new implementation we know for sure that if l1 sends first, then println receives from l1 first. The problem is - we are not sure that l1 sends first!

Let's look at the upstream connections:

if:then -> l1:sig
if:else -> l2:sig

We know for sure that if:then sends first and if:else sends second. Does that mean l1 receives before l2? No! These are parallel, concurrent connections! l1 might be "busy" even though the send from if:then happened earlier. There could be different reasons for that, e.g. the Go scheduler suspended the runtime function that reads from l1 and for some reason activated l2 before l1. It's better not to reason about the specifics - it's just fundamental nondeterminism.
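Here's the same effect modeled in Go: the order of sends into l1 and l2 is fixed, but the goroutines draining them are scheduled arbitrarily (the sleep simulates l1's reader being "busy"):

package main

import (
	"fmt"
	"time"
)

func main() {
	l1 := make(chan string, 1)
	l2 := make(chan string, 1)
	out := make(chan string, 2) // println's inport

	l1 <- "1" // if:then "sends first"
	l2 <- "0" // if:else "sends second"

	go func() {
		time.Sleep(time.Millisecond) // l1's reader happens to be busy
		out <- <-l1
	}()
	go func() {
		out <- <-l2
	}()

	fmt.Println(<-out, <-out) // prints "0 1" - delivery order != send order
}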

It's honestly not clear what to do about this. The fan-in situation was clearly a bug, but this... I have a feeling the program itself is incorrect. Of course we could give up on ordering, but that would mean Neva isn't really general-purpose. I think we can do better than that.

Possible Solutions

  1. Another modification of the runtime - for this we need an idea first, and for now I don't have one. More than that - it looks like the language is fine and the program is incorrect. But how to express the program correctly is, again, unclear.
  2. Modification of the standard library - maybe we need more/different tools for enforcing ordering? Maybe some variations of Lock?

Here's a simplified example of the problem:

[screenshot: Screenshot_2024-07-04_at_17 36 34]