JuliaGPU/Metal.jl

Legalization errors with vectorized code


The following IR leads to a compilation failure:

target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024-n8:16:32"
target triple = "air64-apple-macosx13.5.2"

define void @kernel() {
top:
  %a = load i64, i64 addrspace(1)* null, align 8
  %b = insertelement <4 x i64> zeroinitializer, i64 %a, i32 0
  %c = icmp uge <4 x i64> zeroinitializer, %b
  %d = bitcast <4 x i1> %c to i4
  %e = icmp eq i4 %d, 0
  br i1 %e, label %L1, label %L2

L2:                                              ; preds = %top
  store i32 0, i32* null, align 4
  unreachable

L1:                                             ; preds = %top
  ret void
}

!llvm.module.flags = !{!0, !1, !2, !3, !4, !5, !6, !7, !8}
!julia.kernel = !{!9}
!air.kernel = !{!10}
!llvm.ident = !{!23}
!air.version = !{!24}
!air.language_version = !{!25}

!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = !{i32 7, !"air.max_device_buffers", i32 31}
!3 = !{i32 7, !"air.max_constant_buffers", i32 31}
!4 = !{i32 7, !"air.max_threadgroup_buffers", i32 31}
!5 = !{i32 7, !"air.max_textures", i32 128}
!6 = !{i32 7, !"air.max_read_write_textures", i32 8}
!7 = !{i32 7, !"air.max_samplers", i32 16}
!8 = !{i32 2, !"SDK Version", [3 x i32] [i32 13, i32 5, i32 2]}
!9 = !{void ()* @kernel}
!10 = !{void ()* @kernel, !11, !12}
!11 = !{}
!12 = !{}
!23 = !{!"Julia 1.10.0-beta2 with Metal.jl"}
!24 = !{i32 3, i32 0, i32 0}
!25 = !{!"Metal", i32 3, i32 0, i32 0}

Compilation fails with:

Error Domain=AGXMetalG13X Code=3 "Compiler encountered an internal error" UserInfo={NSLocalizedDescription=Compiler encountered an internal error}

Console log:

LLVM ERROR: unable to legalize instruction: %82:_(s4) = 67 %78:_(<4 x s1>)
Context:
%82:_(s4) = 67 %78:_(<4 x s1>)
%78:_(<4 x s1>) = 62 %77:_(s1), %81:_(s1), %81:_(s1), %81:_(s1)
%77:_(s1) = 55 %76:_(s1), %bb.2, %81:_(s1), %bb.3, %81:_(s1), %bb.4
%81:_(s1) = 95 i1 true
%76:_(s1) = 105 intpred(eq), %75:_(s64), %54:_
%75:_(s64) = 70 %73:_(p1) :: (load 8 from %ir.9, align 536870912, addrspace 1)
%54:_(s64) = 95 i64 0
%73:_(p1) = 70 %3:_(p64) :: (load 8 from `i64 addrspace(1)* addrspace(64)* bitcast ([12 x i8] addrspace(64)* @memorycache1 to i64 addrspace(1)* addrspace(64)*)`, addrspace 64)
%3:_(p64) = 57 @memorycache1
 (in function: agc.main)

Worked around this by disabling the vectorization pipeline in JuliaGPU/GPUCompiler.jl#516.
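
For context, the instruction the backend rejects (`%82:_(s4) = 67 %78:_(<4 x s1>)`) has the same shape as the `bitcast <4 x i1> %c to i4` in the reproducer, which is the kind of mask test that vectorization tends to introduce. The hand-written sketch below shows a scalar form of the same check that never creates a `<4 x i1>` value. It is an illustration only, not the output of GPUCompiler.jl with vectorization disabled. The function name `@kernel_scalar` and the value names `%c0` and `%any` are made up here, and the sketch has not been run through the Metal compiler.

```llvm
; Hypothetical scalar rewrite of @kernel (hand-written, not compiler output).
define void @kernel_scalar() {
top:
  %a = load i64, i64 addrspace(1)* null, align 8
  ; Lane 0 of the original vector compare: 0 >= %a (unsigned).
  %c0 = icmp uge i64 0, %a
  ; Lanes 1-3 compare 0 >= 0, which is always true, so they fold to `true`.
  %any = or i1 %c0, true
  ; The original tests "i4 mask == 0", i.e. no lane is set.
  %e = xor i1 %any, true
  br i1 %e, label %L1, label %L2

L2:
  store i32 0, i32* null, align 4
  unreachable

L1:
  ret void
}
```

Because lanes 1-3 are constant `true`, `%e` is always false and the branch always goes to `%L2`, just as in the reduced reproducer. The point is only that the check can be expressed without a vector-to-integer bitcast.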

Let's consider this fixed.