cucapra/dahlia

Parallel assignment to registers


What does this program mean?

let j = 0;
for (...) { 
  j := j + 1;
  A[j];
}
j := 0;
for (...) {
  B[j];
}

One interpretation (which is inconsistent with C) is that on every cycle, j is incremented by 1, A[j] is read, and then j is set to 0. Obviously, ; makes no guarantees about which branch executes first, which means any interleaving of the above is possible.

Is there a restriction on local variables that's missing? Should locals not escape to the next parallel block?

The way I've been thinking about it is that, even under parallel composition, programs need to respect their data-flow meaning. In other words, ; gives statements permission to run in parallel, but it doesn't require them to; in fact, they only get to run in parallel to the extent that doing so still respects the original program's data flow.

To be very reductive about it, that's the same reason this program would be deterministic:

let j = 0;
j := 1;
j := 2;

The uses of j within the parallel composition above should also respect their data-flow meaning.
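
To spell that out on the original program, here's how I read the dependences (my own annotations, same pseudocode as above):

let j = 0;
for (...) {
  j := j + 1;   // reads and writes j
  A[j];         // reads j
}
j := 0;         // writes j: anti- and output dependences on the first loop
for (...) {
  B[j];         // reads j: flow dependence on j := 0
}

Under that data-flow reading, the second loop has to observe j = 0, exactly as it would under serial execution.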

@sampsyo's explanation makes sense to me.

A follow-up for @sampsyo: it seems that in this case the compiler can make a trivial optimization where the second assignment to j is turned into a fresh variable j2, allowing for parallel execution.
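
Concretely, something like this rewrite (just a sketch in the same pseudocode, not actual compiler output, and assuming A and B are separate memories):

let j = 0;
for (...) {
  j := j + 1;
  A[j];
}
let j2 = 0;     // renamed register: breaks the anti-/output dependences on j
for (...) {
  B[j2];        // now independent of the first loop, so the two loops can run in parallel
}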

In general, it seems we want the program to be in SSA form and to aggressively parallelize when these dependencies don't exist. (For example, this is still not parallelizable if the j := 0 were instead j := j + 1.)
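
For contrast, the variant in the parenthetical has a true (flow) dependence that no renaming can break:

let j = 0;
for (...) {
  j := j + 1;
  A[j];
}
j := j + 1;     // reads the final j produced by the first loop: a flow dependence
for (...) {
  B[j];         // must wait for the first loop to finish
}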

Yes, you can imagine breaking anti-dependencies like that using an optimization. But I don't think such an optimization breaks the mental model we've been using: that the program is guaranteed to execute in such a way that you can't distinguish it from the serial execution, i.e., all the data-flow edges are respected.