[MLIR Limitation] LLVM backend only supports modulo on <=128-bit integers
zzzDavid opened this issue · 2 comments
zzzDavid commented
This thread documents an LLVM limitation of the modulo operation: the llvm.remsi
operation does not support integers wider than 128 bits.
Test case:
import heterocl as hcl

def test_mod_ui():
    hcl.init()

    def kernel():
        x = hcl.scalar(19, "x", dtype="uint128")
        y = hcl.scalar(5, "y", dtype="uint128")
        z = hcl.scalar(0, "z", dtype="uint128")
        z.v = (x.v + 5 - y.v) % z.v

    s = hcl.create_schedule([], kernel)
    print(hcl.lower(s))
    f = hcl.build(s)
    f()
Error message:
Compiling with translate -> llc -> gcc
undefined reference to `__umodei4'
JIT compilation:
JIT session error: Symbols not found: [ __umodei4 ]
Expected<T> must be checked before access or destruction.
Unchecked Expected<T> contained error:
Failed to materialize symbols: { (main, { top, _mlir_top, _mlir__mlir_ciface_top, _mlir_ciface_top }) }
zzzDavid commented
The same issue was raised before on the Halide-based HeteroCL, although the error message looks different:
zzzDavid commented
Bypassed by commit: cornell-zhang/heterocl@6a3015b
The frontend now raises a warning when a modulo operation is applied to an integer wider than 128 bits, and caps the operand bitwidth at 128 bits.
Output of the above example:
[Data Type] Modulo only supports integer <= 128 bits
warnings.warn(self.message, category=self.category)
module {
  func.func @top() attributes {itypes = "", otypes = ""} {
    %c0 = arith.constant 0 : index
    %0 = memref.alloc() {name = "x"} : memref<1xi128>
    %c19_i32 = arith.constant 19 : i32
    %1 = arith.extsi %c19_i32 {unsigned} : i32 to i128
    affine.store %1, %0[%c0] {to = "x", unsigned} : memref<1xi128>
    %2 = memref.alloc() {name = "y"} : memref<1xi128>
    %c5_i32 = arith.constant 5 : i32
    %3 = arith.extsi %c5_i32 {unsigned} : i32 to i128
    affine.store %3, %2[%c0] {to = "y", unsigned} : memref<1xi128>
    %4 = memref.alloc() {name = "z"} : memref<1xi128>
    %c0_i32 = arith.constant 0 : i32
    %5 = arith.extsi %c0_i32 {unsigned} : i32 to i128
    affine.store %5, %4[%c0] {to = "z", unsigned} : memref<1xi128>
    %6 = affine.load %0[0] {from = "x", unsigned} : memref<1xi128>
    %7 = arith.extui %6 : i128 to i130
    %8 = arith.extsi %c5_i32 : i32 to i130
    %9 = arith.addi %7, %8 : i130
    %10 = affine.load %2[0] {from = "y", unsigned} : memref<1xi128>
    %11 = arith.extsi %9 : i130 to i131
    %12 = arith.extui %10 : i128 to i131
    %13 = arith.subi %11, %12 : i131
    %14 = affine.load %4[0] {from = "z", unsigned} : memref<1xi128>
    %15 = arith.trunci %13 : i131 to i128
    %16 = arith.remsi %15, %14 : i128
    affine.store %16, %4[0] {to = "z", unsigned} : memref<1xi128>
    return
  }
}
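For readers who want to see the shape of the bypass, here is a minimal sketch of the warn-and-cap behavior described above. It is not the code from the commit: the helper name cap_mod_bitwidth and the constant MAX_MOD_BITS are hypothetical, and only the warning text is taken from the output shown in this thread.

```python
import warnings

# Hypothetical constant: the widest integer for which the thread reports
# that modulo lowers successfully through the LLVM backend.
MAX_MOD_BITS = 128


def cap_mod_bitwidth(bits, limit=MAX_MOD_BITS):
    """Return the bitwidth to use for a modulo operand.

    Hypothetical helper: if the inferred width exceeds the limit, emit the
    warning seen in the thread and fall back to the limit.
    """
    if bits > limit:
        warnings.warn(f"[Data Type] Modulo only supports integer <= {limit} bits")
        return limit
    return bits
```

With this sketch, cap_mod_bitwidth(131) warns and returns 128, which corresponds to the arith.trunci from i131 to i128 that precedes arith.remsi in the IR above. Truncating the operand can change the result whenever the intermediate value does not fit in 128 bits, so this is a workaround rather than a fix.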