Calyx AXI: implementation lacks conversion between XRT address and Calyx memory addresses

Question

Calyx AXI: implementation lacks conversion between XRT address and Calyx memory addresses

Closed this issue 8 months ago · 2 comments

The current AXI implementation manages to inadvertently convert between byte-addressable base_address addresses and WIDTH-addressable curr_address addresses because we assume the initial base addresses are 0 in our MVP. However I believe the XRT spec requires base addresses to start at 0x1000 (maybe 0x2000? Either way I recall that it is non-zero).

The current implementation has a read-channel curr_addr which indexes into calyx memories like so:

cells {
  ref curr_addr = std_reg(64);
}
// stuff ...
some_group {
  calyx_mem.addr0 = curr_addr.out;
}

And in the main component we have groups that set curr_addr to the value of base_addr:

group set_curr_to_base_addr_A0{
    curr_addr_A0.in = base_addr_A0.out;
    curr_addr_A0.write_en = 1'b1;
    set_curr_to_base_addr_A0[done] = curr_addr_A0.done;
}

where base_addr is related to kernel xml files in a way that I don't fully understand at the moment, and drives xADDR signals:

ARADDR = base_addr.out;

In general the value of base_addr != the value of curr_addr at the start of a transaction. So we need some method of converting between byte addressed base_address and WIDTH addressed curr_addr. . This feels like it may be expensive? I talk about this more in this thread.

It's good to note that for our current load -> compute -> store scheme, we can bypass this issue by having separate counters for our base address and calyx memories. But in the future, if we want to allow for dynamic loads/stores or load more than 256 chunks of data we will need a way to convert between the two.

Answer 1 · 2024-01-18T20:49:42.000Z

Closing this for now as #1854 solves this immediate issue. May be useful to keep this in mind when we work on a wrapper that allows for dynamic AXI requests, but it seems like there is a general way forward where we round up to the next largest power of 2 to avoid complicated mult/div work and instead just shift things around, if necessary.

Answer 2 · 2024-01-22T00:57:31.000Z

Thanks for the clear explanation here! As I'm just catching up, I want to note that there is another route for addressing this specific problem:

So we need some method of converting between byte addressed base_address and WIDTH addressed curr_addr. . This feels like it may be expensive?

To put it in pseudo-C terms, I think what you're saying is that we might need to do something like this:

OUT_TYPE read_value(int curr_addr) {
  int byte_offset = (curr_addr * WIDTH) / 8;  // assuming everything divides nicely
  return axi_read(byte_offset + BASE_ADDR);
}

Is that an accurate reflection? Anyway, it's worth keeping in mind that, if we are actually doing this in a loop to load an entire memory or whatever, like this:

void load_entire_memory(OUT_TYPE *mem) {
  for (int i = 0; i < LENGTH; ++i) {
    mem[i] = read_value(i);
  }
}

Then you can do a kind of "induction variable elimination" and get rid of the * and / altogether, replacing them with += bumps:

int axi_addr = BASE_ADDR;
int byte_width = WIDTH / 8;  // again assuming nice division here
for (int i = 0; i < LENGTH; ++i) {
  axi_addr += byte_width;
  mem[i] = axi_read(axi_addr);
}

I realize this may be a little cryptic, but hopefully close to making sense?