Race-free Parallel Connections
emil14 opened this issue · 2 comments
The current runtime implementation spawns a goroutine for each pipe/fan-in connection. Even though (direct) fan-in connections are race-free, there's still a race condition in parallel connections.
Simplest picture of the problem. `s1`, `s2`, `r1` and `r2` are all ports:

```
s1 -> r1
s2 -> r2
```
Because `s1 -> r1` and `s2 -> r2` are (concurrent) goroutines, there's a race condition. What this means is that the following situation is possible:
1. 2 goroutines were spawned to serve these two pipelines
2. `s1` sends the `m1` message to `r1` and blocks; `s2` sends the `m2` message to `r2` and also blocks
3. go scheduler activates the second (`s2 -> r2`) goroutine, which receives the message from `s2` to send it to `r2` (and blocks); `s2` unblocks
4. go scheduler activates the first (`s1 -> r1`) goroutine, which receives the message from `s1` to send it to `r1` (and blocks); `s1` unblocks
At step 3 we can see the race condition. There's an order set by the senders (1. `s1->`, 2. `s2->`), and that order needs to be preserved on the receiving side (1. `->r1`, 2. `->r2`).
It's very important to point out that even if this order is preserved, there's no guarantee that `r1` will actually receive before `r2`. There are multiple reasons for that, from `r1` simply being slower than `r2` to the go scheduler activating `r1` and `r2` in a different order. That's ok, we don't have control over these things. What we want to guarantee is, again, that we send to receivers in the same order as we received from senders.
Now there are a lot of questions, like:
- Is this always necessary? Could it be limited to only ports of the same node? E.g. what is the case for `a -> n1:x; b -> n2:x;` where it's critical that `n1` receives before `n2`?
- If it is necessary (there are cases where >=2 nodes need ordering), could it be handled through locks, HOCs?
- For the cases where it is necessary: which options do we have that meet these requirements? What are their upsides and downsides?
Very close to #689
Channel is Inport, not In/OutPort (+ Back to Graph Reduction)
We don't have fan-out at runtime anymore, which means it's possible to move from the old design (channel per port) to a new one: channel per inport. For pipeline connections that means the channel is the connection itself; for fan-in connections it means that several runtime functions write to the same inport channel.
This would solve the issue we are talking about at all levels (array-inport, several inports, even several nodes). If we don't have goroutines that do the delivery (receive from sender, send to receiver), we simply can't have this race condition. There's no "delivery": runtime functions (senders) just write directly to their receivers (other runtime functions).
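A sketch of what this could look like (hypothetical code, just to illustrate the idea): a sender's "outport" is nothing but a reference to the receiver's inport channel, so the send itself is the delivery and there's no intermediary to race:

```go
package main

import "fmt"

func main() {
	// Only inports are channels; an outport is just a reference
	// to the inport channel it's connected to.
	r1in := make(chan string) // r1's inport
	r2in := make(chan string) // r2's inport

	// s1 and s2 as runtime functions whose outports were "wired"
	// directly to the receivers' inports by the runtime.
	go func(out chan<- string) { out <- "m1" }(r1in) // s1 -> r1
	go func(out chan<- string) { out <- "m2" }(r2in) // s2 -> r2

	fmt.Println(<-r1in, <-r2in) // receivers read their own inports
}
```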
Graph Reduction
If there are no delivery goroutines, all message passing is done by the runtime functions themselves. We already did almost this with the single queue for all outports. This time we go even further: outports are not represented by channels at all. Only inports have channels, and an outport writes directly to an inport. That's the pipeline case. Fan-in already works naturally this way: several runtime functions write to the same channel, as the sketch below shows. That's free serialization; we don't even need atomic counters for it. (We still might need counters for serialized receiving on array-inports though; that situation is exactly the same as existing fan-in: several senders, one receiver.)
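A small sketch of the fan-in case (again, illustrative names, not runtime code): several senders share one inport channel, and the channel send itself serializes the writes:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	in := make(chan int) // one inport channel shared by all fan-in senders

	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(id int) { // three runtime functions fanned in to one inport
			defer wg.Done()
			in <- id // the channel send itself serializes the writes
		}(i)
	}
	go func() { wg.Wait(); close(in) }()

	for msg := range in {
		fmt.Println("received", msg) // messages arrive strictly one at a time
	}
}
```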
However, that means we need graph reduction. It's not (again) an optimization pass, but rather a requirement for the program to work.
Deadlock
We can get into the same situation we had with the single-queue solution: if there's a cycle, a component writing to its outport is actually writing to its own inport. In theory this could be solved by giving the channel a buffer of at least 1 message, but we need to think about it more.
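A toy illustration of the cycle problem and the buffer idea (not real runtime code): with an unbuffered channel the send below blocks forever, because the same goroutine is both sender and receiver.

```go
package main

import "fmt"

func main() {
	// A node in a cycle: its outport is wired back to its own inport.
	self := make(chan int, 1) // buffer of at least 1 message

	self <- 42          // with make(chan int) this line would deadlock
	fmt.Println(<-self) // with the buffer, the node reads its message back
}
```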
All Node's Inports Use Single Queue
...