NVIDIA/TensorRT-Incubator

infinite loop in TransposeEliminationPass with dynamic input


The following IR causes an infinite loop in the TransposeEliminationPass:

#map = affine_map<(d0, d1, d2) -> (d0, d2, d1)>
module {
  func.func @test(%arg0: tensor<?x80x80xf32>) -> tensor<?x80x80xf32> {
    %cst_f32 = tensorrt.constant dense<1.000000e+00> : tensor<1x1x1xf32>
    %1 = tensorrt.transpose {permutation = #map} %arg0 : tensor<?x80x80xf32> to tensor<?x80x80xf32>
    %2 = tensorrt.element_wise <kSUB>(%cst_f32, %1 : tensor<1x1x1xf32>, tensor<?x80x80xf32>) -> tensor<?x80x80xf32>
    return %2 : tensor<?x80x80xf32>
  }
}

More details: I think the issue comes from the PushdownTransposeEwise pattern (it may exist in other similar patterns as well). Could something be wrong with pushDownTransposePrecondition when it calculates memoryCost for dynamic inputs?

Thanks for reporting this. It looks like your guess is correct: the precondition doesn't handle dynamic shapes correctly. I've tested a simple fix internally and should have it synced up here by the end of the week.
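
To illustrate the kind of failure discussed above, here is a minimal Python sketch of a cost-based precondition that treats dynamic dimensions explicitly. It is not the actual TensorRT-Incubator code and not necessarily the fix that landed: the names `DYNAMIC`, `naive_cost`, `memory_cost`, and `should_push_down`, the sentinel value, and the comparison rule are all hypothetical. The idea is that an element count computed from a shape containing a dynamic-dimension sentinel is meaningless, so a rewrite guarded by such a cost should be skipped rather than decided on a bogus number.

```python
from math import prod
from typing import Optional, Sequence

# Hypothetical sentinel standing in for a dynamic ("?") dimension.
# The concrete value is an assumption made for this sketch.
DYNAMIC = -1


def naive_cost(shape: Sequence[int]) -> int:
    """Element count that ignores dynamic dims; the result is meaningless if any dim is DYNAMIC."""
    return prod(shape)


def memory_cost(shape: Sequence[int]) -> Optional[int]:
    """Element count of a tensor, or None when any dimension is dynamic."""
    if any(d == DYNAMIC for d in shape):
        return None
    return prod(shape)


def should_push_down(transposed_shape: Sequence[int], other_shape: Sequence[int]) -> bool:
    """Hypothetical precondition: rewrite only when both costs are known and
    transposing the other operand is strictly cheaper."""
    transposed_cost = memory_cost(transposed_shape)
    other_cost = memory_cost(other_shape)
    if transposed_cost is None or other_cost is None:
        # Unknown cost: decline the rewrite instead of guessing.
        return False
    return other_cost < transposed_cost


# Shapes taken from the IR in this issue: tensor<?x80x80xf32> and tensor<1x1x1xf32>.
print(naive_cost([DYNAMIC, 80, 80]))                   # negative, so any comparison on it is unreliable
print(should_push_down([DYNAMIC, 80, 80], [1, 1, 1]))  # declined because the cost is unknown
print(should_push_down([2, 80, 80], [1, 1, 1]))        # static shapes, comparison is meaningful
```

In this sketch, requiring a strict inequality and declining on unknown cost keeps the rule from applying when it cannot show an improvement, which is one common way to keep a greedy pattern rewriter from cycling.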

Fixed in #371