JuliaGPU/Metal.jl

`Metal.code_agx()` fails on macOS 15 Beta 3

christiangnrd opened this issue · 2 comments

julia> using Metal

julia> Metal.versioninfo()
macOS 15.0.0, Darwin 24.0.0

Toolchain:
- Julia: 1.10.4
- LLVM: 15.0.7

Julia packages: 
- Metal.jl: 1.2.0
- LLVMDowngrader_jll: 0.3.0+1

1 device:
- Apple M2 Max (64.000 KiB allocated)

julia> dummy() = return
dummy (generic function with 1 method)

julia> Metal.code_agx(dummy, Tuple{})
ERROR: Mach-O file does not contain Symtab load commands
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] ObjectFile.MachO.MachOSymbols(oh::ObjectFile.MachO.MachOHandle{IOStream})
    @ ObjectFile.MachO ~/.julia/packages/ObjectFile/2udxC/src/MachO/MachOSymbol.jl:59
  [3] ObjectFile.Symbols(oh::ObjectFile.MachO.MachOHandle{IOStream})
    @ ObjectFile.MachO ~/.julia/packages/ObjectFile/2udxC/src/MachO/MachOSymbol.jl:64
  [4] extract_gpu_code(f::Metal.var"#179#182"{String, Base.TTY}, binary::String)
    @ Metal ~/.julia/dev/Metal/src/compiler/reflection.jl:114
  [5] #178
    @ ~/.julia/dev/Metal/src/compiler/reflection.jl:69 [inlined]
  [6] mktempdir(fn::Metal.var"#178#181"{Metal.MTL.MTLBinaryArchiveInstance, Base.TTY}, parent::String; prefix::String)
    @ Base.Filesystem ./file.jl:766
  [7] mktempdir (repeats 2 times)
    @ ./file.jl:762 [inlined]
  [8] macro expansion
    @ ~/.julia/dev/Metal/src/compiler/reflection.jl:61 [inlined]
  [9] (::Metal.var"#177#180"{Base.TTY, GPUCompiler.CompilerJob{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}})()
    @ Metal ~/.julia/packages/ObjectiveC/uWFwg/src/foundation.jl:637
 [10] macro expansion
    @ ~/.julia/packages/ObjectiveC/uWFwg/src/foundation.jl:565 [inlined]
 [11] macro expansion
    @ ./lock.jl:267 [inlined]
 [12] ObjectiveC.Foundation.NSAutoreleasePool(f::Metal.var"#177#180"{Base.TTY, GPUCompiler.CompilerJob{…}})
    @ ObjectiveC.Foundation ~/.julia/packages/ObjectiveC/uWFwg/src/foundation.jl:557
 [13] code_agx
    @ ~/.julia/packages/ObjectiveC/uWFwg/src/foundation.jl:636 [inlined]
 [14] code_agx(io::Base.TTY, func::Any, types::Any, kernel::Bool; kwargs::@Kwargs{})
    @ Metal ~/.julia/dev/Metal/src/compiler/reflection.jl:36
 [15] code_agx
    @ ~/.julia/dev/Metal/src/compiler/reflection.jl:30 [inlined]
 [16] code_agx(io::Base.TTY, func::Any, types::Any)
    @ Metal ~/.julia/dev/Metal/src/compiler/reflection.jl:30
 [17] code_agx(func::Any, types::Any; kwargs::@Kwargs{})
    @ Metal ~/.julia/dev/Metal/src/compiler/reflection.jl:160
 [18] top-level scope
    @ REPL[3]:1
 [19] top-level scope
    @ ~/.julia/dev/Metal/src/initialization.jl:58
Some type information was truncated. Use `show(err)` to see complete types.
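
For readers unfamiliar with the error: it is raised by ObjectFile.jl when it looks for a symbol table in the extracted binary, which presumably means the file has no `LC_SYMTAB` load command. The following is a minimal, self-contained sketch of that kind of check, not ObjectFile.jl's or Metal.jl's actual code. The helper names `has_symtab` and `fake_macho` are hypothetical, and the sketch assumes a thin, native-endian 64-bit Mach-O image and runs against a synthetic buffer rather than a real GPU binary.

```julia
# Mach-O constants, as defined in <mach-o/loader.h>
const MH_MAGIC_64 = 0xfeedfacf   # magic number of a 64-bit, native-endian image
const LC_SYMTAB   = 0x00000002   # load command describing the symbol table
const LC_UUID     = 0x0000001b   # unrelated load command, used here as filler

"""
Return `true` if the 64-bit Mach-O image in `io` contains an LC_SYMTAB
load command. Hypothetical helper, for illustration only.
"""
function has_symtab(io::IO)
    magic = read(io, UInt32)
    magic == MH_MAGIC_64 || error("not a native-endian 64-bit Mach-O image")
    skip(io, 12)                      # cputype, cpusubtype, filetype
    ncmds = read(io, UInt32)
    skip(io, 12)                      # sizeofcmds, flags, reserved
    for _ in 1:ncmds
        start   = position(io)
        cmd     = read(io, UInt32)
        cmdsize = read(io, UInt32)
        cmd == LC_SYMTAB && return true
        cmdsize >= 8 || error("malformed load command")
        seek(io, start + cmdsize)     # jump to the next load command
    end
    return false
end

# Build a synthetic image whose load commands carry no payload (hypothetical).
function fake_macho(cmds::Vector{UInt32})
    io = IOBuffer()
    write(io, MH_MAGIC_64)
    write(io, UInt32(0), UInt32(0), UInt32(0))                 # cputype, cpusubtype, filetype
    write(io, UInt32(length(cmds)))                            # ncmds
    write(io, UInt32(8 * length(cmds)), UInt32(0), UInt32(0))  # sizeofcmds, flags, reserved
    for cmd in cmds
        write(io, cmd, UInt32(8))                              # cmd, cmdsize
    end
    seekstart(io)
    return io
end

has_symtab(fake_macho(UInt32[LC_UUID, LC_SYMTAB]))  # true
has_symtab(fake_macho(UInt32[LC_UUID]))             # false
```

If a check like this returns `false` for the binary that `extract_gpu_code` writes out on macOS 15, the binary archive format has probably changed, and the failure is not a bug in the symbol lookup itself.
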

Ah, that's too bad. That said, `code_agx` was already pretty useless on M3s, where most of the output fails to disassemble, so I wonder whether we should just get rid of it (and its Python_jll dependency):

julia> @device_code_agx @metal identity(nothing)
; GPUCompiler.CompilerJob{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}(MethodInstance for identity(::Nothing), CompilerConfig for GPUCompiler.MetalCompilerTarget, 0x0000000000007b0c)

___Z8identityv._agc.main.constant_program:
   0: 0e00000006000600     iadd             r0l, 0, r16l
   8: 0600                 <disassembly failed>
   a: 0600                 <disassembly failed>
   c: 0600                 <disassembly failed>
   e: 0600                 <disassembly failed>
  10: 0600                 <disassembly failed>
  12: 0600                 <disassembly failed>
  14: 0600                 <disassembly failed>
  16: 0600                 <disassembly failed>
  18: 0600                 <disassembly failed>
  1a: 0600                 <disassembly failed>
  1c: 0600                 <disassembly failed>
  1e: 0600                 <disassembly failed>
  20: 0600                 <disassembly failed>
  22: 0600                 <disassembly failed>
  24: 0600                 <disassembly failed>
  26: 0600                 <disassembly failed>
  28: 0600                 <disassembly failed>
  2a: 0600                 <disassembly failed>
  2c: 0600                 <disassembly failed>
  2e: 0600                 <disassembly failed>
  30: 0600                 <disassembly failed>
  32: 0600                 <disassembly failed>
  34: 0600                 <disassembly failed>
  36: 0600                 <disassembly failed>
  38: 0600                 <disassembly failed>
  3a: 0600                 <disassembly failed>
  3c: 0600                 <disassembly failed>
  3e: 0600                 <disassembly failed>

___Z8identityv._agc.main:
   0: 0e000000             iadd             r0l, 0, 0

To be honest, I haven't taken the time to familiarize myself with the `code_X` functions yet, so I can't really weigh in on whether it's worth fixing and keeping `code_agx` rather than just getting rid of it.

However, if we do get rid of them, that's a breaking change, so I'd take the opportunity to unexport `MTL` as well, as discussed in #359.