cucapra/dahlia

Explicit combine registers

rachitnigam opened this issue · 5 comments

TLDR: Instead of magically transforming let bound variables into combine registers for combine blocks, explicitly mark variables that should be made available in the combine block.

We'd have to write:

    let acc = 0.0;
    for (let i = 0..32) unroll 4 {
      let combine x = f(a[i]);
    } combine {
      acc = acc + x[0] + x[1] + x[2] + x[3];
    }

to make x available in the combine scope. If a variable is not marked combine, it will not be available in the context of the combine.

This also solves another problem where the loop body can define a let bound array which cannot be magically turned into a combine register.

Seems like a reasonable proposal—I have two questions, though:

  • Can you clarify what you had in mind as the problem this is solving (other than the array thing)? Is it about clarity or ease of implementation? (It doesn't actually seem like it simplifies the implementation vs. just giving this treatment to all variables defined in the body and referenced in the combine block.)
  • While the main proposal is about the extra combine qualifier, the example listing makes another change too: referring to the elements of the register explicitly using subscripting rather than implicitly using +=. Is that an intentional conflation, or is that a separate idea?

Here’s what the current implementation does:

  • When the type checking of a scope finishes, all let bound variables are returned to the parent context.
  • For combine blocks, the type checker gives all let bound identifiers a fully banked memory type. The implementation will fail if there are any let bound arrays. I’d like to clarify what kind of let bound variables can be used as combine registers.
  • With #290, the affine checker, which is separate from the type checker, can do one of two things: track all bindings in every scope, or only track the combine registers. The latter seems cleaner.

Answer to second point:

  • you can already write the program above. Since the type checker treats let variables as just memories, it will allow such a use. Unfortunately the generated code will be incorrect since in the C++ code, the let bound variable is just a scalar value.
  • The solution to this would be to explicitly transform combine registers into fully banked memories and using them as array indexes by the unrolled iterator.

Aha, yes, it seems like a good idea to allow let-bound arrays (which are still kind of a weird thing!) even in unrolled loops when they aren't used in combine. However, I suppose these options seem like six of one, half a dozen of the other in terms of implementation cleanliness/complexity:

  • changing the AST to mark variables that are allowed to turn into combine registers
  • "lazily" doing the combine register thing to all variables, only when they are actually referenced in combine

So either one is cool!


Unrelatedly, regarding the other things in the above comment:

  • FWIW, as I recall discussing one some time ago, the "fully banked memory" thing for combine registers is perfectly workable as a hack, but it is kind of a hack. There are a couple of things that make combine registers more like an ordinary tuple than a memory: we probably don't want these things to be mutable, and affine tracking on them is unnecessary. They're more like ordinary values than physical resources.
  • To clarify the plan about explicit indexing: is your current plan to get rid of the += syntax in favor of the explicit indexing? I think it would be cool to keep it, even if we make the explicit version actually work also.
  • changing the AST to mark variables that are allowed to turn into combine registers

The code for this would be somewhat ugly--while traversing a for loop body, if a let-bound variable matches a hidden condition, allow it to be a combine register. Where would the error occur for a bad use of such a variable?

  • "lazily" doing the combine register thing to all variables, only when they are actually referenced in combine

Similar implementation problem where you'd have to capture all free variables in the combine body and then reverse engineer which ones are allowed to be combine registers. Again, it's unclear what the error message should look like. When the variable is defined in the loop body but not allowed to be a combine register, the error should say so. When the variable is free in the for loop body and the lexical context, the error should say so.


I'm not sure if we want to implement this, but I want to propose the following desugaring for combine registers.

    for (let i = 0..32) unroll 4 {
      let combine x = a[i] + b[i];
    } combine {
      acc += x
    }

would turn into:

    for (let i = 0..8) {
       let x: int[4 bank 4];
       for (let j = 0..4) {
         x{j}[0] = a[4*i + j] + b[4*i + j];
       }
       acc += x; 
    }

The semantics of reducers like += is still that they build a reduction tree for the value on RHS. Not sure if there is a fast encoding for this in HLS tools. The += would have to become an explicit adder tree somehow.

The code for this would be somewhat ugly--while traversing a for loop body, if a let-bound variable matches a hidden condition, allow it to be a combine register. Where would the error occur for a bad use of such a variable?

Indeed! But to clarify, this was my attempt to summarize your proposal. 😃 So I suppose the error would be a "variable undefined" error when referencing it in the combine block?


Yes! That is the desugaring I was imagining. This would be an extension to the general desugaring we have discussed at some points where we turn every unrolled for into a nested plain for and pfor (i.e., a fully unrolled loop). Then the combine block, as you point out, could totally go in the outer loop. Nice and clean!