nod-ai/iree-amd-aie

Add e2e support for pack pipeline

nirvedhmeshram opened this issue · 2 comments

Based on recent discussions the pad-pack pipeline has allowed us to make progress but the long term solution seems to be that we want to further build the pack pipeline which is currently not e2e enabled. I used the patch below to test where it is today.

--- a/compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Passes.cpp
+++ b/compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Passes.cpp
@@ -283,13 +283,8 @@ void buildAMDAIETransformPassPipeline(OpPassManager &pm) {
     pm.addPass(createAMDAIELowerExecutableTargetPass(options));
   }
   pm.addPass(createAMDAIELowerWorkgroupCountPass());
-  if (clUsePipeline == AIEPassPipeline::PadPackPipeline) {
-    auto &modulePassManager = pm.nest<ModuleOp>();
-    addMLIRAIRAIELoweringPasses(modulePassManager);
-  } else if (clUsePipeline != AIEPassPipeline::PackPipeline) {
-    auto &modulePassManager = pm.nest<ModuleOp>();
-    addMLIRAIRAIELegacyLoweringPasses(modulePassManager);
-  }
+  auto &modulePassManager = pm.nest<ModuleOp>();
+  addMLIRAIRAIELoweringPasses(modulePassManager);

For a matmul with shape M.N,K all 64 it seems to be crashing in the air-to-aie lowering. Here is the IR dump with the crash dump in the end. @erwei-xilinx any thoughts why it would crash there?
https://gist.github.com/nirvedhmeshram/25cf5426efea6cb048c06881ee801b60

The IR should probably have a fix mentioned here
#247 (review)
However, I think the crash mentioned in the issue is independent of this.

@nirvedhmeshram , the pack pipeline is generating memref.alloc and dealloc ops outside of all scf.parallel loops, which conflicts with MLIR-AIR's assumption that allocs are generated next to the first copy (or pack) operation using its ssa value. Or to put it another way, AIR expects L2 memalloc to be at least within outer scf.parallel and L1 memalloc within inner scf.parallel, where their usage are scoped within those regions. Please see how pad-pack pipeline generates allocs, as a reference.

This is the only changes needed to make it work with MLIR-AIR. Here is my dev setup showing the IR working with MLIR-AIR: https://gist.github.com/erwei-xilinx/41b71f269b2e08754360725ebfd758a1

Tested with MLIR-AIR head (0ea5703ed03a896a1649de72440228ceffa81f86).