calyxir/calyx

Identify loop induction variables

Opened this issue · 5 comments

When using repeat statements in Calyx, we currently end up creating two registers: one that tracks when the loop needs to exit (inserted by the compiler), and another one usually inserted by the frontend to iterate over some value.

In general, it seems useful be to able to identify loop induction variable so that we can share them in passes. They can also be used for a host of other optimizations.

To do this correctly, we need to use live-range analysis and detect the live range of registers. Specifically, the register:

  1. Needs to be overwritten before the loop starts
  2. Needs to be overwritten before the end of each loop body

I might be missing other constraints so let's look through class SW loop induction variable detection and make sure we are getting all the corner cases.

This sounds potentially quite useful! It seems like there are two big options here for the core problem of reducing that obvious redundancy:

  1. Analyze the program to identify registers that behave like loop induction variables in repeat loops.
  2. Expose the underlying counter register inside repeat loops. That is, you'd spell it repeat(reg, n) { ... } or something where reg is a register of size log(n). Then we know the basic loop induction variables "by construction" instead of needing to discover them.

The trade-offs here are standard: one makes the compiler more predictable and less mysterious, while the other leads to a more regular IL that simplifies control-flow handling in general.

If I redesigned Calyx, I would make registers and wires abstract in the IR like @cgyurgyik did in the MLIR dialect. There are a lot of passes, analyses, and optimizations that would benefit from that.

I don't see why option 2 is necessarily mutually exclusive with option 1. It's reasonable for me to have repeat and repeat(reg, n) (or sim.) syntax coexisting, and identification of loop variable registers could simply be easier when repeat is constructed using the second syntax.

The more things you have in the IR, the more things you have to worry about optimizing so we generally prefer having canonical versions.

I meant that the loop counter variable would be created regardless, but Adrian's syntax could give it a custom name.