roastduck/FreeTensor

Schedule separate_tail incorrectly hoists branch with modified dependency

Blealtan opened this issue · 1 comments

import ir

@ir.transform
def foo(x):
    ir.declare_var(x, (32,), 'int32', 'output', 'cpu')
    a = ir.create_var((), 'int32', 'cpu')
    a[()] = 10
    for i in range(32):
        # a is written inside the loop, so the condition below reads a loop-written value
        a[()] = 20
        if i < a[()]:
            x[i] = i
        else:
            x[i] = 32 - i

s = ir.Schedule(foo)
s.separate_tail()
print(s.func())

The above code yields the following output:

→ python3 playground.py 
[WARING] ../src/pass/detail/simplify.h:721: SimplifyPass iterates over 100 rounds. Maybe there is a bug
[WARING] ../src/pass/detail/simplify.h:721: SimplifyPass iterates over 100 rounds. Maybe there is a bug
func(x) {
x:
  [out] [CPU] x: i32[32] {
a:
    [cache] [CPU] a: i32[] {
      a[] = 10
      if ((a[] >= 0) && (a[] <= 32)) {
        for i in 0 : a[] : 1 {
          a[] = 20
          x[i] = i
        }
        for i in a[] : 32 : 1 {
          a[] = 20
          x[i] = ((-1 * i) + 32)
        }
      }
      else {
        for i in 0 : 32 : 1 {
          a[] = 20
          if (i < a[]) {
            x[i] = i
          }
          else {
            x[i] = ((-1 * i) + 32)
          }
        }
      }
    }
  }
}

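The first branch, which is the one taken after the transformation because a[] is 10 when the guard is evaluated, is inconsistent with the original code: the loop bounds read a[] outside the loop body, while the original condition i < a[] is always evaluated after a[] = 20.
The simplification warning is also suspicious, since the input code is simple, and the if ((a[] >= 0) && (a[] <= 32)) branch is not eliminated after scheduling.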
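
The mismatch can be reproduced in plain Python, without FreeTensor. This is only a sketch: original and hoisted are hypothetical hand-written models of the two programs above, and it assumes loop bounds are evaluated once on loop entry, which may not match how the generated code behaves.

```python
def original():
    """Model of the input program: the condition is checked after a = 20."""
    x = [0] * 32
    a = 10
    for i in range(32):
        a = 20
        if i < a:
            x[i] = i
        else:
            x[i] = 32 - i
    return x


def hoisted():
    """Model of the first branch of the separate_tail output.

    Assumption: each loop bound is read once, when the loop is entered.
    """
    x = [0] * 32
    a = 10
    if 0 <= a <= 32:
        for i in range(0, a):    # bound read while a is still 10
            a = 20
            x[i] = i
        for i in range(a, 32):   # bound read after the first loop wrote a
            a = 20
            x[i] = 32 - i
    return x


# Indices where the two models disagree
mismatches = [i for i, (p, q) in enumerate(zip(original(), hoisted())) if p != q]
print(mismatches)
```

Under that assumption the two models disagree on every index between the stale bound (10) and the value written in the loop (20).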

Fixed by no longer separating loops with respect to any condition that reads a variable which is written inside the loop.

This criterion might be somewhat relaxed. If you want a stricter criterion in the future, I can run analyze/find_loop_variant instead.
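
The rule can be illustrated on plain Python source with the standard ast module. This is a toy sketch, not FreeTensor's implementation: written_names, read_names and separable_conditions are hypothetical helpers, and the check only compares variable names (it requires Python 3.9+ for ast.unparse).

```python
import ast


def written_names(loop):
    """Names assigned inside the loop body, as scalars or through a subscript."""
    names = set()
    for stmt in loop.body:
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                names.add(node.id)
            elif (isinstance(node, ast.Subscript)
                  and isinstance(node.ctx, ast.Store)
                  and isinstance(node.value, ast.Name)):
                names.add(node.value.id)
    return names


def read_names(expr):
    """Names read anywhere inside an expression."""
    return {n.id for n in ast.walk(expr)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)}


def separable_conditions(src):
    """For each `if` in the first loop of src, report whether its condition
    reads only variables that the loop body never writes."""
    loop = next(n for n in ast.walk(ast.parse(src)) if isinstance(n, ast.For))
    written = written_names(loop)
    return [(ast.unparse(node.test), not (read_names(node.test) & written))
            for node in ast.walk(loop) if isinstance(node, ast.If)]


unsafe = """
for i in range(32):
    a[()] = 20
    if i < a[()]:
        x[i] = i
    else:
        x[i] = 32 - i
"""

safe = """
for i in range(32):
    if i < n:
        x[i] = i
    else:
        x[i] = 32 - i
"""

print(separable_conditions(unsafe))
print(separable_conditions(safe))
```

The loop variable itself is not counted as written, because it is the target of the loop rather than part of its body, so conditions on i alone remain separable.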