ReshapedArray indexing broken because of Int128 operation
Hi,
I am running the following code and am hitting an internal compiler error:
using Metal
elt = Float32
dev = Metal.mtl
x = @view reshape(dev(randn(elt, 8, 8))', 64)[1:8]
@allowscalar y = copy(x)
Compilation to native code failed; see below for details.
If you think this is a bug, please file an issue and attach /tmp/jl_y9AnAdkNGZ.metallib
I have the temp file available but cannot attach it to the GitHub issue.
Thanks!
What type is elt?
@christiangnrd elt = Float32
Sorry, I forgot to add that definition. Thanks!
I have the temp file available but cannot attach it to the github issue
You probably have to zip it.
Also, which version of Metal.jl are you using? Please ensure you're trying v1.1.0.
@maleadt Sorry, I didn't provide adequate version information. I am using Metal v1.1.0, but I did not have an issue with this code in the previous release of Metal. Here is my versioninfo:
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: macOS (arm64-apple-darwin22.4.0)
CPU: 10 × Apple M1 Max
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Thanks!
metal_error.zip
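For anyone reading the IR that follows: the i128 instructions carry debug locations pointing at widen, _mul_high, div and divrem in multinverses.jl, reached from _ind2sub_rs in reshapedarray.jl. The sketch below tries to show the same code path on the CPU, with a plain Array standing in for the Metal array. It assumes a 64-bit Julia and reads the internal fields `mi`, `multiplier`, `addmul` and `shift` as they appear in Julia 1.10, and these internals may differ in other versions. The helpers `mul_high_sketch` and `div_sketch` are hypothetical names that mirror what the IR appears to compute. They are not Base functions.

```julia
using LinearAlgebra
using Base.MultiplicativeInverses: SignedMultiplicativeInverse

# CPU stand-in for the array in the report (Array instead of a Metal array).
A = reshape(collect(Float32, 1:64), 8, 8)
r = reshape(A', 64)        # Adjoint parent uses Cartesian indexing
x = @view r[1:8]

# Internal field (assumption: layout as in Julia 1.10). The reshape keeps one
# multiplicative inverse per leading parent dimension to avoid integer division.
mi = r.mi[1]

# High 64 bits of the 128-bit product: this is where Int128 enters.
mul_high_sketch(a::Int64, b::Int64) = ((widen(a) * b) >>> 64) % Int64

# Division by the stored inverse, following the sequence visible in the IR.
function div_sketch(a::Int64, m::SignedMultiplicativeInverse{Int64})
    q = mul_high_sketch(a, m.multiplier)
    q += a * m.addmul
    abs(m.divisor) == 1 ? a * m.divisor : (signbit(q) + (q >> m.shift)) % Int64
end

println(typeof(r))
println(typeof(widen(Int64(1))))
println(all(div_sketch(a, mi) == div(a, mi) for a in Int64(0):Int64(63)))
println(copy(x))
```

If this reading is right, any reshape of a wrapper without fast linear indexing (Adjoint, Transpose, and similar) would go through the same widening multiply, which would explain why the kernel contains 128-bit integer operations at all.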
; ModuleID = 'shader.air'
source_filename = "start"
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024-n8:16:32"
target triple = "air64-apple-macosx14.4.1"
; Function Attrs: cold noreturn nounwind
declare void @llvm.trap() #0
declare i64 @air.abs.s.i64(i64) local_unnamed_addr
define internal fastcc void @gpu_report_exception() unnamed_addr !dbg !58 {
top:
ret void, !dbg !61
}
define internal fastcc void @gpu_signal_exception() unnamed_addr !dbg !62 {
top:
ret void, !dbg !64
}
; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare i64 @llvm.smax.i64(i64, i64) #1
; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare i8 @llvm.umin.i8(i8, i8) #1
define void @_Z15getindex_kernel16mtlKernelContext14MtlDeviceArrayI7Float32Li1ELi1EE13ReshapedArrayIS1_Li1E7AdjointIS1_S0_IS1_Li2ELi1EEE5TupleI27SignedMultiplicativeInverseI5Int64EEES4_IS6_E9UnitRangeIS6_E({ i8 addrspace(1)*, [1 x i64] } addrspace(1)* %0, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1, [1 x i64] addrspace(1)* %2, [2 x i64] addrspace(1)* %3, i32 %threads_per_grid, i32 %thread_position_in_grid) local_unnamed_addr !dbg !65 {
conversion:
%4 = bitcast { i8 addrspace(1)*, [1 x i64] } addrspace(1)* %0 to float addrspace(1)* addrspace(1)*
%.unpack12 = load float addrspace(1)*, float addrspace(1)* addrspace(1)* %4, align 8
%5 = getelementptr inbounds { i8 addrspace(1)*, [1 x i64] }, { i8 addrspace(1)*, [1 x i64] } addrspace(1)* %0, i64 0, i32 1, i64 0
%.unpack10.unpack = load i64, i64 addrspace(1)* %5, align 8
%6 = bitcast { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1 to float addrspace(1)* addrspace(1)*
%.unpack.unpack.unpack26 = load float addrspace(1)*, float addrspace(1)* addrspace(1)* %6, align 8
%.unpack.unpack.unpack19.elt = getelementptr inbounds { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] }, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1, i64 0, i32 0, i64 0, i32 1, i64 0
%.unpack.unpack.unpack19.unpack = load i64, i64 addrspace(1)* %.unpack.unpack.unpack19.elt, align 8
%7 = getelementptr inbounds { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] }, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1, i64 0, i32 2, i64 0
%.unpack16.unpack = load { i64, i64, i8, i8 }, { i64, i64, i8, i8 } addrspace(1)* %7, align 8
%.fca.2.0.0.extract = extractvalue { i64, i64, i8, i8 } %.unpack16.unpack, 0
%.fca.2.0.1.extract = extractvalue { i64, i64, i8, i8 } %.unpack16.unpack, 1
%.fca.2.0.2.extract = extractvalue { i64, i64, i8, i8 } %.unpack16.unpack, 2
%.fca.2.0.3.extract = extractvalue { i64, i64, i8, i8 } %.unpack16.unpack, 3
%8 = add i32 %thread_position_in_grid, 1, !dbg !67
%9 = zext i32 %8 to i64, !dbg !84
%.not = icmp ne i32 %8, 0, !dbg !95
%10 = icmp sge i64 %.unpack10.unpack, %9, !dbg !97
%narrow = select i1 %.not, i1 %10, i1 false, !dbg !97
br i1 %narrow, label %L20, label %common.ret, !dbg !97
common.ret: ; preds = %L87, %conversion
ret void, !dbg !98
L20: ; preds = %conversion
%.elt = getelementptr inbounds [2 x i64], [2 x i64] addrspace(1)* %3, i64 0, i64 0
%.unpack = load i64, i64 addrspace(1)* %.elt, align 8
%11 = getelementptr inbounds { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] }, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1, i64 0, i32 1, i64 0
%.unpack14.unpack = load i64, i64 addrspace(1)* %11, align 8
%12 = add nsw i64 %9, -1, !dbg !99
%13 = add i64 %12, %.unpack, !dbg !105
%14 = call i64 @air.max.s.i64(i64 %.unpack14.unpack, i64 0), !dbg !106
%15 = add i64 %13, -1, !dbg !134
%.not5 = icmp ult i64 %15, %14, !dbg !137
br i1 %.not5, label %L87, label %L84, !dbg !129
L84: ; preds = %L20
call fastcc void @gpu_report_exception(), !dbg !139
call fastcc void @gpu_signal_exception(), !dbg !139
call void @llvm.trap(), !dbg !139
unreachable, !dbg !139
L87: ; preds = %L20
%16 = sext i64 %15 to i128, !dbg !143
%17 = sext i64 %.fca.2.0.1.extract to i128, !dbg !165
%18 = mul nsw i128 %17, %16, !dbg !168
%19 = lshr i128 %18, 64, !dbg !170
%20 = trunc i128 %19 to i64, !dbg !173
%21 = sext i8 %.fca.2.0.2.extract to i64, !dbg !174
%22 = mul i64 %15, %21, !dbg !177
%23 = add i64 %22, %20, !dbg !179
%24 = call i64 @air.abs.s.i64(i64 %.fca.2.0.0.extract), !dbg !180
%.not7 = icmp eq i64 %24, 1, !dbg !184
%25 = mul i64 %.fca.2.0.0.extract, %15, !dbg !186
%26 = call i8 @air.min.u.i8(i8 %.fca.2.0.3.extract, i8 63), !dbg !187
%.v = zext i8 %26 to i64, !dbg !187
%27 = ashr i64 %23, %.v, !dbg !187
%.lobit = lshr i64 %23, 63, !dbg !189
%28 = add i64 %27, %.lobit, !dbg !194
%29 = select i1 %.not7, i64 %25, i64 %28, !dbg !196
%30 = mul i64 %29, %.fca.2.0.0.extract, !dbg !197
%31 = sub i64 %15, %30, !dbg !199
%32 = call i64 @air.max.s.i64(i64 %.unpack.unpack.unpack19.unpack, i64 0), !dbg !200
%33 = mul i64 %31, %32, !dbg !220
%34 = add i64 %33, %29, !dbg !225
%35 = getelementptr inbounds float, float addrspace(1)* %.unpack.unpack.unpack26, i64 %34, !dbg !226
%36 = load float, float addrspace(1)* %35, align 4, !dbg !226, !tbaa !240
%37 = getelementptr inbounds float, float addrspace(1)* %.unpack12, i64 %12, !dbg !243
store float %36, float addrspace(1)* %37, align 4, !dbg !243, !tbaa !240
br label %common.ret
}
declare i64 @air.max.s.i64(i64, i64)
declare i8 @air.min.u.i8(i8, i8)
attributes #0 = { cold noreturn nounwind }
attributes #1 = { nocallback nofree nosync nounwind readnone speculatable willreturn }
!llvm.module.flags = !{!0, !1, !2, !3, !4, !5, !6, !7, !8}
!llvm.dbg.cu = !{!9, !11, !12, !13, !14, !15, !16, !17, !18, !19, !20, !21, !22, !23, !24, !25, !26, !27, !28, !29, !30, !31, !32, !33, !34, !35, !36, !37, !38, !39, !40, !41, !42, !43, !44}
!julia.kernel = !{!45}
!air.kernel = !{!46}
!llvm.ident = !{!55}
!air.version = !{!56}
!air.language_version = !{!57}
!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = !{i32 7, !"air.max_device_buffers", i32 31}
!3 = !{i32 7, !"air.max_constant_buffers", i32 31}
!4 = !{i32 7, !"air.max_threadgroup_buffers", i32 31}
!5 = !{i32 7, !"air.max_textures", i32 128}
!6 = !{i32 7, !"air.max_read_write_textures", i32 8}
!7 = !{i32 7, !"air.max_samplers", i32 16}
!8 = !{i32 2, !"SDK Version", [3 x i32] [i32 14, i32 4, i32 1]}
!9 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!10 = !DIFile(filename: "julia", directory: ".")
!11 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!12 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!13 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!14 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!15 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!16 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!17 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!18 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!19 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!20 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!21 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!22 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!23 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!24 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!25 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!26 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!27 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!28 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!29 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!30 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!31 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!32 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!33 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!34 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!35 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!36 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!37 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!38 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!39 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!40 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!41 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!42 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!43 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!44 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly)
!45 = !{void ({ i8 addrspace(1)*, [1 x i64] } addrspace(1)*, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)*, [1 x i64] addrspace(1)*, [2 x i64] addrspace(1)*, i32, i32)* @_Z15getindex_kernel16mtlKernelContext14MtlDeviceArrayI7Float32Li1ELi1EE13ReshapedArrayIS1_Li1E7AdjointIS1_S0_IS1_Li2ELi1EEE5TupleI27SignedMultiplicativeInverseI5Int64EEES4_IS6_E9UnitRangeIS6_E}
!46 = !{void ({ i8 addrspace(1)*, [1 x i64] } addrspace(1)*, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)*, [1 x i64] addrspace(1)*, [2 x i64] addrspace(1)*, i32, i32)* @_Z15getindex_kernel16mtlKernelContext14MtlDeviceArrayI7Float32Li1ELi1EE13ReshapedArrayIS1_Li1E7AdjointIS1_S0_IS1_Li2ELi1EEE5TupleI27SignedMultiplicativeInverseI5Int64EEES4_IS6_E9UnitRangeIS6_E, !47, !48}
!47 = !{}
!48 = !{!49, !50, !51, !52, !53, !54}
!49 = !{i32 0, !"air.buffer", !"air.location_index", i32 0, i32 1, !"air.read_write", !"air.address_space", i32 1, !"air.arg_type_size", i32 16, !"air.arg_type_align_size", i32 8, !"air.arg_type_name", !"MtlDeviceVector{Float32, 1}", !"air.arg_name", !"dest"}
!50 = !{i32 1, !"air.buffer", !"air.location_index", i32 1, i32 1, !"air.read_write", !"air.address_space", i32 1, !"air.arg_type_size", i32 56, !"air.arg_type_align_size", i32 8, !"air.arg_type_name", !"Base.ReshapedArray{Float32, 1, LinearAlgebra.Adjoint{Float32, MtlDeviceMatrix{Float32, 1}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}", !"air.arg_name", !"src"}
!51 = !{i32 2, !"air.buffer", !"air.location_index", i32 2, i32 1, !"air.read_write", !"air.address_space", i32 1, !"air.arg_type_size", i32 8, !"air.arg_type_align_size", i32 8, !"air.arg_type_name", !"Tuple{Int64}", !"air.arg_name", !"idims"}
!52 = !{i32 3, !"air.buffer", !"air.location_index", i32 3, i32 1, !"air.read_write", !"air.address_space", i32 1, !"air.arg_type_size", i32 16, !"air.arg_type_align_size", i32 8, !"air.arg_type_name", !"UnitRange{Int64}", !"air.arg_name", !"Is"}
!53 = !{i32 4, !"air.threads_per_grid", !"air.arg_type_name", !"uint"}
!54 = !{i32 5, !"air.thread_position_in_grid", !"air.arg_type_name", !"uint"}
!55 = !{!"Julia 1.10.2 with Metal.jl"}
!56 = !{i32 2, i32 5, i32 0}
!57 = !{!"Metal", i32 3, i32 1, i32 0}
!58 = distinct !DISubprogram(name: "report_exception", linkageName: "julia_report_exception_3328", scope: null, file: !59, line: 13, type: !60, scopeLine: 13, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !47)
!59 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/device/runtime.jl", directory: ".")
!60 = !DISubroutineType(cc: DW_CC_nocall, types: !47)
!61 = !DILocation(line: 18, scope: !58)
!62 = distinct !DISubprogram(name: "signal_exception", linkageName: "julia_signal_exception_3349", scope: null, file: !59, line: 9, type: !63, scopeLine: 9, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !28, retainedNodes: !47)
!63 = !DISubroutineType(types: !47)
!64 = !DILocation(line: 10, scope: !62)
!65 = distinct !DISubprogram(name: "getindex_kernel", linkageName: "julia_getindex_kernel_4165", scope: null, file: !66, line: 82, type: !63, scopeLine: 82, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!66 = !DIFile(filename: "/Users/kpierce/.julia/packages/GPUArrays/OKkAu/src/host/indexing.jl", directory: ".")
!67 = !DILocation(line: 87, scope: !68, inlinedAt: !70)
!68 = distinct !DISubprogram(name: "+;", linkageName: "+", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!69 = !DIFile(filename: "int.jl", directory: ".")
!70 = !DILocation(line: 49, scope: !71, inlinedAt: !73)
!71 = distinct !DISubprogram(name: "#thread_position_in_grid_1d;", linkageName: "#thread_position_in_grid_1d", scope: !72, file: !72, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!72 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/device/intrinsics/arguments.jl", directory: ".")
!73 = !DILocation(line: 36, scope: !74, inlinedAt: !76)
!74 = distinct !DISubprogram(name: "global_index;", linkageName: "global_index", scope: !75, file: !75, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!75 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/gpuarrays.jl", directory: ".")
!76 = !DILocation(line: 44, scope: !77, inlinedAt: !79)
!77 = distinct !DISubprogram(name: "linear_index;", linkageName: "linear_index", scope: !78, file: !78, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!78 = !DIFile(filename: "/Users/kpierce/.julia/packages/GPUArrays/OKkAu/src/device/indexing.jl", directory: ".")
!79 = !DILocation(line: 66, scope: !80, inlinedAt: !81)
!80 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !78, file: !78, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!81 = !DILocation(line: 85, scope: !82, inlinedAt: !83)
!82 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !66, file: !66, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!83 = !DILocation(line: 82, scope: !65)
!84 = !DILocation(line: 708, scope: !85, inlinedAt: !87)
!85 = distinct !DISubprogram(name: "toInt64;", linkageName: "toInt64", scope: !86, file: !86, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!86 = !DIFile(filename: "boot.jl", directory: ".")
!87 = !DILocation(line: 784, scope: !88, inlinedAt: !89)
!88 = distinct !DISubprogram(name: "Int64;", linkageName: "Int64", scope: !86, file: !86, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!89 = !DILocation(line: 7, scope: !90, inlinedAt: !92)
!90 = distinct !DISubprogram(name: "convert;", linkageName: "convert", scope: !91, file: !91, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!91 = !DIFile(filename: "number.jl", directory: ".")
!92 = !DILocation(line: 551, scope: !93, inlinedAt: !94)
!93 = distinct !DISubprogram(name: "rem;", linkageName: "rem", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!94 = !DILocation(line: 1066, scope: !68, inlinedAt: !76)
!95 = !DILocation(line: 514, scope: !96, inlinedAt: !97)
!96 = distinct !DISubprogram(name: "<=;", linkageName: "<=", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!97 = !DILocation(line: 67, scope: !80, inlinedAt: !81)
!98 = !DILocation(line: 0, scope: !82, inlinedAt: !83)
!99 = !DILocation(line: 86, scope: !100, inlinedAt: !101)
!100 = distinct !DISubprogram(name: "-;", linkageName: "-", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!101 = !DILocation(line: 929, scope: !102, inlinedAt: !104)
!102 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !103, file: !103, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!103 = !DIFile(filename: "range.jl", directory: ".")
!104 = !DILocation(line: 87, scope: !82, inlinedAt: !83)
!105 = !DILocation(line: 87, scope: !68, inlinedAt: !101)
!106 = !DILocation(line: 647, scope: !107, inlinedAt: !109)
!107 = distinct !DISubprogram(name: "ifelse;", linkageName: "ifelse", scope: !108, file: !108, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!108 = !DIFile(filename: "essentials.jl", directory: ".")
!109 = !DILocation(line: 532, scope: !110, inlinedAt: !112)
!110 = distinct !DISubprogram(name: "max;", linkageName: "max", scope: !111, file: !111, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!111 = !DIFile(filename: "promotion.jl", directory: ".")
!112 = !DILocation(line: 454, scope: !113, inlinedAt: !114)
!113 = distinct !DISubprogram(name: "OneTo;", linkageName: "OneTo", scope: !103, file: !103, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!114 = !DILocation(line: 467, scope: !113, inlinedAt: !115)
!115 = !DILocation(line: 469, scope: !116, inlinedAt: !117)
!116 = distinct !DISubprogram(name: "oneto;", linkageName: "oneto", scope: !103, file: !103, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!117 = !DILocation(line: 291, scope: !118, inlinedAt: !120)
!118 = distinct !DISubprogram(name: "map;", linkageName: "map", scope: !119, file: !119, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!119 = !DIFile(filename: "tuple.jl", directory: ".")
!120 = !DILocation(line: 98, scope: !121, inlinedAt: !123)
!121 = distinct !DISubprogram(name: "axes;", linkageName: "axes", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!122 = !DIFile(filename: "abstractarray.jl", directory: ".")
!123 = !DILocation(line: 137, scope: !124, inlinedAt: !125)
!124 = distinct !DISubprogram(name: "axes1;", linkageName: "axes1", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!125 = !DILocation(line: 389, scope: !126, inlinedAt: !127)
!126 = distinct !DISubprogram(name: "eachindex;", linkageName: "eachindex", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!127 = !DILocation(line: 687, scope: !128, inlinedAt: !129)
!128 = distinct !DISubprogram(name: "checkbounds;", linkageName: "checkbounds", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!129 = !DILocation(line: 702, scope: !128, inlinedAt: !130)
!130 = !DILocation(line: 248, scope: !131, inlinedAt: !133)
!131 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !132, file: !132, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!132 = !DIFile(filename: "reshapedarray.jl", directory: ".")
!133 = !DILocation(line: 88, scope: !82, inlinedAt: !83)
!134 = !DILocation(line: 86, scope: !100, inlinedAt: !135)
!135 = !DILocation(line: 763, scope: !136, inlinedAt: !127)
!136 = distinct !DISubprogram(name: "checkindex;", linkageName: "checkindex", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!137 = !DILocation(line: 513, scope: !138, inlinedAt: !135)
!138 = distinct !DISubprogram(name: "<;", linkageName: "<", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!139 = !DILocation(line: 4, scope: !140, inlinedAt: !142)
!140 = distinct !DISubprogram(name: "#throw_boundserror", linkageName: "julia_#throw_boundserror_4181", scope: null, file: !141, line: 33, type: !60, scopeLine: 33, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !14, retainedNodes: !47)
!141 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/device/quirks.jl", directory: ".")
!142 = distinct !DILocation(line: 702, scope: !128, inlinedAt: !130)
!143 = !DILocation(line: 715, scope: !144, inlinedAt: !145)
!144 = distinct !DISubprogram(name: "toInt128;", linkageName: "toInt128", scope: !86, file: !86, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!145 = !DILocation(line: 785, scope: !146, inlinedAt: !147)
!146 = distinct !DISubprogram(name: "Int128;", linkageName: "Int128", scope: !86, file: !86, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!147 = !DILocation(line: 7, scope: !90, inlinedAt: !148)
!148 = !DILocation(line: 891, scope: !149, inlinedAt: !151)
!149 = distinct !DISubprogram(name: "widen;", linkageName: "widen", scope: !150, file: !150, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!150 = !DIFile(filename: "operators.jl", directory: ".")
!151 = !DILocation(line: 139, scope: !152, inlinedAt: !154)
!152 = distinct !DISubprogram(name: "_mul_high;", linkageName: "_mul_high", scope: !153, file: !153, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!153 = !DIFile(filename: "multinverses.jl", directory: ".")
!154 = !DILocation(line: 158, scope: !155, inlinedAt: !156)
!155 = distinct !DISubprogram(name: "div;", linkageName: "div", scope: !153, file: !153, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!156 = !DILocation(line: 172, scope: !157, inlinedAt: !158)
!157 = distinct !DISubprogram(name: "divrem;", linkageName: "divrem", scope: !153, file: !153, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!158 = !DILocation(line: 223, scope: !159, inlinedAt: !160)
!159 = distinct !DISubprogram(name: "_ind2sub_rs;", linkageName: "_ind2sub_rs", scope: !132, file: !132, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!160 = !DILocation(line: 220, scope: !161, inlinedAt: !162)
!161 = distinct !DISubprogram(name: "ind2sub_rs;", linkageName: "ind2sub_rs", scope: !132, file: !132, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!162 = !DILocation(line: 260, scope: !163, inlinedAt: !164)
!163 = distinct !DISubprogram(name: "_unsafe_getindex;", linkageName: "_unsafe_getindex", scope: !132, file: !132, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!164 = !DILocation(line: 249, scope: !131, inlinedAt: !133)
!165 = !DILocation(line: 549, scope: !93, inlinedAt: !166)
!166 = !DILocation(line: 1066, scope: !167, inlinedAt: !151)
!167 = distinct !DISubprogram(name: "*;", linkageName: "*", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!168 = !DILocation(line: 1053, scope: !167, inlinedAt: !169)
!169 = !DILocation(line: 1068, scope: !167, inlinedAt: !151)
!170 = !DILocation(line: 530, scope: !171, inlinedAt: !172)
!171 = distinct !DISubprogram(name: ">>>;", linkageName: ">>>", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!172 = !DILocation(line: 538, scope: !171, inlinedAt: !151)
!173 = !DILocation(line: 544, scope: !93, inlinedAt: !151)
!174 = !DILocation(line: 549, scope: !93, inlinedAt: !175)
!175 = !DILocation(line: 1066, scope: !167, inlinedAt: !176)
!176 = !DILocation(line: 159, scope: !155, inlinedAt: !156)
!177 = !DILocation(line: 88, scope: !167, inlinedAt: !178)
!178 = !DILocation(line: 1068, scope: !167, inlinedAt: !176)
!179 = !DILocation(line: 87, scope: !68, inlinedAt: !176)
!180 = !DILocation(line: 302, scope: !181, inlinedAt: !183)
!181 = distinct !DISubprogram(name: "#abs;", linkageName: "#abs", scope: !182, file: !182, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!182 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/device/intrinsics/math.jl", directory: ".")
!183 = !DILocation(line: 160, scope: !155, inlinedAt: !156)
!184 = !DILocation(line: 521, scope: !185, inlinedAt: !183)
!185 = distinct !DISubprogram(name: "==;", linkageName: "==", scope: !111, file: !111, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!186 = !DILocation(line: 88, scope: !167, inlinedAt: !183)
!187 = !DILocation(line: 527, scope: !188, inlinedAt: !183)
!188 = distinct !DISubprogram(name: ">>;", linkageName: ">>", scope: !69, file: !69, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!189 = !DILocation(line: 711, scope: !85, inlinedAt: !190)
!190 = !DILocation(line: 784, scope: !88, inlinedAt: !191)
!191 = !DILocation(line: 7, scope: !90, inlinedAt: !192)
!192 = !DILocation(line: 546, scope: !93, inlinedAt: !193)
!193 = !DILocation(line: 1066, scope: !68, inlinedAt: !183)
!194 = !DILocation(line: 87, scope: !68, inlinedAt: !195)
!195 = !DILocation(line: 1068, scope: !68, inlinedAt: !183)
!196 = !DILocation(line: 647, scope: !107, inlinedAt: !183)
!197 = !DILocation(line: 88, scope: !167, inlinedAt: !198)
!198 = !DILocation(line: 173, scope: !157, inlinedAt: !158)
!199 = !DILocation(line: 86, scope: !100, inlinedAt: !198)
!200 = !DILocation(line: 647, scope: !107, inlinedAt: !201)
!201 = !DILocation(line: 532, scope: !110, inlinedAt: !202)
!202 = !DILocation(line: 454, scope: !113, inlinedAt: !203)
!203 = !DILocation(line: 467, scope: !113, inlinedAt: !204)
!204 = !DILocation(line: 469, scope: !116, inlinedAt: !205)
!205 = !DILocation(line: 292, scope: !118, inlinedAt: !206)
!206 = !DILocation(line: 98, scope: !121, inlinedAt: !207)
!207 = !DILocation(line: 2957, scope: !208, inlinedAt: !209)
!208 = distinct !DISubprogram(name: "_sub2ind;", linkageName: "_sub2ind", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!209 = !DILocation(line: 1330, scope: !210, inlinedAt: !211)
!210 = distinct !DISubprogram(name: "_to_linear_index;", linkageName: "_to_linear_index", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!211 = !DILocation(line: 114, scope: !212, inlinedAt: !214)
!212 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !213, file: !213, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!213 = !DIFile(filename: "/Users/kpierce/.julia/packages/Metal/q9oGt/src/device/array.jl", directory: ".")
!214 = !DILocation(line: 329, scope: !215, inlinedAt: !217)
!215 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !216, file: !216, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!216 = !DIFile(filename: "/Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/LinearAlgebra/src/adjtrans.jl", directory: ".")
!217 = !DILocation(line: 264, scope: !218, inlinedAt: !219)
!218 = distinct !DISubprogram(name: "_unsafe_getindex_rs;", linkageName: "_unsafe_getindex_rs", scope: !132, file: !132, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!219 = !DILocation(line: 261, scope: !163, inlinedAt: !164)
!220 = !DILocation(line: 88, scope: !167, inlinedAt: !221)
!221 = !DILocation(line: 2989, scope: !222, inlinedAt: !223)
!222 = distinct !DISubprogram(name: "_sub2ind_recurse;", linkageName: "_sub2ind_recurse", scope: !122, file: !122, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!223 = !DILocation(line: 2989, scope: !222, inlinedAt: !224)
!224 = !DILocation(line: 2973, scope: !208, inlinedAt: !207)
!225 = !DILocation(line: 86, scope: !100, inlinedAt: !226)
!226 = !DILocation(line: 38, scope: !227, inlinedAt: !229)
!227 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !228, file: !228, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!228 = !DIFile(filename: "/Users/kpierce/.julia/packages/LLVM/bzSzE/src/interop/base.jl", directory: ".")
!229 = !DILocation(line: 0, scope: !230, inlinedAt: !232)
!230 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !231, file: !231, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!231 = !DIFile(filename: "none", directory: ".")
!232 = !DILocation(line: 0, scope: !233, inlinedAt: !234)
!233 = distinct !DISubprogram(name: "pointerref;", linkageName: "pointerref", scope: !231, file: !231, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!234 = !DILocation(line: 85, scope: !235, inlinedAt: !237)
!235 = distinct !DISubprogram(name: "unsafe_load;", linkageName: "unsafe_load", scope: !236, file: !236, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!236 = !DIFile(filename: "/Users/kpierce/.julia/packages/LLVM/bzSzE/src/interop/pointer.jl", directory: ".")
!237 = !DILocation(line: 82, scope: !238, inlinedAt: !239)
!238 = distinct !DISubprogram(name: "arrayref;", linkageName: "arrayref", scope: !213, file: !213, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!239 = !DILocation(line: 103, scope: !212, inlinedAt: !211)
!240 = !{!241, !241, i64 0, i64 0}
!241 = !{!"custom_tbaa_addrspace(1)", !242, i64 0}
!242 = !{!"custom_tbaa"}
!243 = !DILocation(line: 38, scope: !227, inlinedAt: !244)
!244 = !DILocation(line: 0, scope: !230, inlinedAt: !245)
!245 = !DILocation(line: 0, scope: !246, inlinedAt: !247)
!246 = distinct !DISubprogram(name: "pointerset;", linkageName: "pointerset", scope: !231, file: !231, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!247 = !DILocation(line: 88, scope: !248, inlinedAt: !249)
!248 = distinct !DISubprogram(name: "unsafe_store!;", linkageName: "unsafe_store!", scope: !236, file: !236, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!249 = !DILocation(line: 88, scope: !250, inlinedAt: !251)
!250 = distinct !DISubprogram(name: "arrayset;", linkageName: "arrayset", scope: !213, file: !213, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!251 = !DILocation(line: 105, scope: !252, inlinedAt: !253)
!252 = distinct !DISubprogram(name: "setindex!;", linkageName: "setindex!", scope: !213, file: !213, type: !63, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !12, retainedNodes: !47)
!253 = !DILocation(line: 89, scope: !82, inlinedAt: !83)
Compiler error:
unable to legalize instruction: %248:_(s64) = 147 %241:_, %243:_
Context:
%248:_(s64) = 147 %241:_, %243:_
%241:_(s64), %242:_(s64) = 74 %49:_(s128)
%243:_(s64), %244:_(s64) = 74 %50:_(s128)
%49:_(s128) = 124 %26:_(s64)
%50:_(s128) = 91 %41:_(p1) :: (load (s64) from %ir.19 + 8, addrspace 1)
%26:_(s64) = 45 %25:_, %18:_
%41:_(p1) = 81 %105:_(s64)
%25:_(s64) = nsw 46 %8:_, %93:_
%18:_(s64) = 90 %16:_(p1) :: (load (s64) from %ir..elt3, addrspace 1)
%105:_(s64) = 45 %94:_, %104:_
%8:_(s64) = 126 %7:gpr32(s32)
%93:_(s64) = 120 i64 2
%16:_(p1) = 90 %17:_(p64) :: (dereferenceable load (p1) from @agc.buffer_pointers.3, addrspace 64)
%94:_(s64) = 80 %28:_(p1)
%104:_(s64) = 120 i64 40
%7:gpr32(s32) = 45 %0:_, %6:_
%17:_(p64) = 71 @agc.buffer_pointers.3
%28:_(p1) = 90 %15:_(p64) :: (dereferenceable load (p1) from @agc.buffer_pointers.1, addrspace 64)
%0:_(s32) = 116 intrinsic(@llvm.agx2.thread.position.in.grid.x)
Reduced:
define void @my_kernel({ i8 addrspace(1)*, [1 x i64] } addrspace(1)* %0, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)* %1, [1 x i64] addrspace(1)* %2, [2 x i64] addrspace(1)* %3, i32 %a, i32 %thread_position_in_grid) {
b:
%.c.d = load { i64, i64, i8, i8 }, { i64, i64, i8, i8 } addrspace(1)* null, align 4
%.e.2.0.1.extract = extractvalue { i64, i64, i8, i8 } %.c.d, 1
%4 = sext i64 %.e.2.0.1.extract to i128
%5 = mul i128 %4, -2
%6 = lshr i128 %5, 1
%7 = trunc i128 %6 to i64
%8 = getelementptr float, float addrspace(1)* null, i64 %7
%9 = load float, float addrspace(1)* %8, align 4
store float %9, float addrspace(1)* null, align 4
ret void
}
!air.kernel = !{!0}
!air.version = !{!8}
!0 = !{void ({ i8 addrspace(1)*, [1 x i64] } addrspace(1)*, { [1 x { i8 addrspace(1)*, [2 x i64] }], [1 x i64], [1 x { i64, i64, i8, i8 }] } addrspace(1)*, [1 x i64] addrspace(1)*, [2 x i64] addrspace(1)*, i32, i32)* @my_kernel, !1, !2}
!1 = !{}
!2 = !{!3, !3, !4, !5, !6, !7}
!3 = !{i32 1, !""}
!4 = !{i32 2, !""}
!5 = !{i32 3, !""}
!6 = !{i32 4, !""}
!7 = !{i32 5, !""}
!8 = !{i32 2, i32 5, i32 0}
This gives the same crash, I think:
unable to legalize instruction: %53:_(s64) = 147 %46:_, %48:_
Context:
%53:_(s64) = 147 %46:_, %48:_
%46:_(s64), %47:_(s64) = 74 %3:_(s128)
%48:_(s64), %49:_(s64) = 74 %4:_(s128)
%3:_(s128) = 91 %1:_(p1) :: (load (s64) from `i64 addrspace(1)* inttoptr (i64 8 to i64 addrspace(1)*)`, addrspace 1)
%4:_(s128) = 120 i128 36893488147419103230
%1:_(p1) = 81 %2:_(s64)
%2:_(s64) = 120 i64 8
(in function: agc.main.constant_program)
Bisected to JuliaGPU/GPUArrays.jl#512
This should at least yield nicer error messages
With the above:
Reason: unsupported use of i128 value
Stacktrace:
[1] toInt128
@ ./boot.jl:715
[2] Int128
@ ./boot.jl:785
[3] convert
@ ./number.jl:7
[4] widen
@ ./operators.jl:891
[5] _mul_high
@ ./multinverses.jl:139
[6] div
@ ./multinverses.jl:158
[7] divrem
@ ./multinverses.jl:172
[8] _ind2sub_rs
@ ./reshapedarray.jl:223
[9] ind2sub_rs
@ ./reshapedarray.jl:220
[10] _unsafe_getindex
@ ./reshapedarray.jl:260
[11] getindex
@ ./reshapedarray.jl:249
[12] macro expansion
@ ~/Julia/pkg/GPUArrays/src/host/indexing.jl:88
[13] getindex_kernel
@ ~/Julia/pkg/GPUArrays/src/host/indexing.jl:82
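The bottom frames of this trace (widen, _mul_high, div, divrem) can be exercised on the CPU, where Int128 is supported. The following is a minimal sketch, assuming a 64-bit Julia 1.10 session; it uses the internal Base.MultiplicativeInverses module, whose API is not guaranteed to be stable.

```julia
# Assumption: 64-bit Julia, so the literals below are Int64.
# widen(Int64) is Int128; this widening is what shows up as i128 in the IR.
@assert widen(Int64) === Int128

# A precomputed multiplicative inverse for the divisor 8 (internal Base API).
mi = Base.MultiplicativeInverses.multiplicativeinverse(8)
@assert mi isa Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}

# Dividing by the inverse goes through divrem -> div -> _mul_high,
# the same frames that appear in the trace.
q, r = divrem(17, mi)
@assert (q, r) == divrem(17, 8)
```

On the CPU this runs without problems; the failure only appears once the same code path is compiled for the GPU.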
So the problem is that normally operations like view and reshape preserve the MtlArray, however here the reshape of an Adjoint results in an actual ReshapedArray. Indexing on that array wrapper is implemented (in Base) using Int128, which is already visible in the type signature:
SubArray{Float32, 1, Base.ReshapedArray{Float32, 1, LinearAlgebra.Adjoint{Float32, MtlMatrix{Float32, Private}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Tuple{UnitRange{Int64}}, false}
LLVM normally supports legalizing such operations, but that only happens during ISel, and Apple's implementation doesn't seem to allow that. And legalizing i128 to i64 in IR seems tricky.
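The wrapper type described above does not depend on Metal and can be reproduced with plain CPU arrays. This is a minimal sketch, assuming a 64-bit platform (so Int is Int64) and using ordinary Array in place of MtlArray:

```julia
using LinearAlgebra

A = randn(Float32, 8, 8)

plain   = reshape(A, 64)    # dense array: reshape returns another Array
wrapped = reshape(A', 64)   # Adjoint: reshape falls back to Base.ReshapedArray

@assert plain isa Vector{Float32}

# The multiplicative-inverse tuple is part of the wrapper's type.
@assert typeof(wrapped) <: Base.ReshapedArray{
    Float32, 1, <:Adjoint,
    <:Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}

# Linear index 2 of the reshaped adjoint maps to A'[2, 1], i.e. A[1, 2].
@assert wrapped[2] == A[1, 2]
```

Every getindex on wrapped converts the linear index back to Cartesian indices of the parent, which is where the division by the multiplicative inverse happens.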
@timholy You originally added the ReshapedArray type; is there a way to opt out of the use of Int128, which I presume comes from the SignedMultiplicativeInverse{Int64} indices? Alternatively, I guess we could overlay ind2sub_rs, but that feels like a hack.
Sorry @maleadt I'm only now noticing this ping. I guess the problem is that there isn't anything to widen to? Any chance of adding https://github.com/rfourquet/BitIntegers.jl as a dependency?
Any chance of adding https://github.com/rfourquet/BitIntegers.jl as a dependency?
That could probably work. We have an alternative workaround now, with an implementation of mul_high that doesn't widen (#379).
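For illustration, a high-word multiply that avoids widening can be built from 32-bit halves. The sketch below follows the textbook schoolbook decomposition and is not necessarily the implementation used in #379; the name mul_high_nowiden is hypothetical.

```julia
# Hypothetical helper: high 64 bits of the signed 128-bit product u * v,
# computed with Int64 arithmetic only (no Int128).
function mul_high_nowiden(u::Int64, v::Int64)
    mask = Int64(0xffffffff)
    u0 = u & mask; u1 = u >> 32    # low half (unsigned), high half (signed)
    v0 = v & mask; v1 = v >> 32
    w0 = u0 * v0                   # may wrap; the bit pattern is still correct
    t  = u1 * v0 + (w0 >>> 32)     # logical shift: w0 is treated as unsigned
    w1 = t & mask
    w2 = t >> 32                   # arithmetic shift keeps the sign
    w1 = u0 * v1 + w1
    return u1 * v1 + w2 + (w1 >> 32)
end

# Reference result using Int128, which is fine on the CPU.
reference(u, v) = ((widen(u) * widen(v)) >> 64) % Int64

@assert mul_high_nowiden(typemax(Int64), typemax(Int64)) ==
        reference(typemax(Int64), typemax(Int64))
@assert mul_high_nowiden(-1, 1) == reference(-1, 1)
```

Because every intermediate value fits in 64 bits, a function of this shape never produces i128 values in the generated IR.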