Xilinx/mlir-aie

ObjectFifo: "aie.use_lock" op can only access a lock in the same tile

andrej opened this issue · 7 comments

The following code errors:

module {
  aie.device(ipu) {
    
    %tile_0_2 = aie.tile(0, 2)
    %tile_1_2 = aie.tile(1, 2)
    %tile_2_2 = aie.tile(2, 2)

    aie.objectfifo @a(%tile_2_2, {%tile_1_2}, 2 : i32) : !aie.objectfifo<memref<1024xbf16>>
    aie.objectfifo @b(%tile_1_2 toStream [<size = 32, stride = 1>], {%tile_0_2}, 2 : i32) : !aie.objectfifo<memref<32x32xbf16>>

    aie.objectfifo.link [@a] -> [@b]()

  }
}

with

error: unknown: 'aie.use_lock' op can only access a lock in the same tile
 note: unknown: see current operation: "aie.use_lock"(%10) <{acq_en = true, action = 2 : i32, value = 1 : i32}> : (index) -> ()

The error disappears when you remove the toStream from objectfifo b.

This one fails even without toStream:

module {
  aie.device(ipu) {
    %tile_0_2 = aie.tile(0, 2)
    %tile_2_2 = aie.tile(2, 2)
    %tile_3_2 = aie.tile(3, 2)

    aie.objectfifo @a(%tile_3_2, {%tile_2_2}, 2 : i32) : !aie.objectfifo<memref<1024xbf16>>
    aie.objectfifo @b(%tile_2_2, {%tile_0_2}, 2 : i32) : !aie.objectfifo<memref<1024xbf16>>

    aie.objectfifo.link [@a] -> [@b]()

  }
}

(Same error)

I'll double check this in more detail tomorrow but I think it's because you are using LinkOp with a shared tile that is a compute tile not a Mem tile. There's missing checks at that level that should verify whether the shared tile is not sharing memory with the compute tile, which will result in errors as the shared tile's DMA can't access the locks in the compute tile. It should disappear if you use tile (1, 1) in the first example instead of (1, 2). But I'll add actual checks to give you a more meaningful error ASAP

Oh yeah, looks like the error does disappear when I use tile (1, 2). Thank you.

Do you expect that link on a compute tile should work, or have you not really considered this? I only ran into this while testing and it's not a hugely important feature, just curious.

I think it should work because it's a feature that any tile DMA can support as long as that tile has a memory module. However, the Object FIFO lowering looks for opportunities to use shared memory and that conflicts with the lowering for the LinkOp, which requires DMAs.

It's on my immediate TODO list to start by adding a check with better error messages. The next step will then be to support to feature instead of emitting an error.

Another question is whether we allow the core of the shared tile (when compute tiles will be supported by the LinkOp) to access the objects of the Object FIFOs and modify them. That will likely involve some lock gymnastics between the core and the DMA.

Thank you Andra!

To give some context and motivation, my use case for this is that I have a design that is bandwidth-bound. So I need to use more channels to get data in to the compute core, quicker, in parallel.

My idea was to have two objectFifos that come in through the shim, memtile, onto the core, each with half the data. Then, on the core itself, we would have a "join" pattern that puts the data into the same buffer, contiguously.

But I think due to the restrictions you mention it is currently not possible to express this with the objectFifo and link. The best I was able to do was to have the "join" take place on the mem tile, but that does not help me: the channel from the mem tile into the core is the bottleneck. Data would need to come in in parallel on separate channels and be combined on the compute core.

For now, I can just go back to writing my BDs manually -- no problem.