JuliaGPU/Metal.jl

Metal.jl produces incorrect (incomplete) results with DiffEqGPU on Julia v1.10


https://buildkite.com/julialang/diffeqgpu-dot-jl/builds/1005#018cf3fe-98ac-455f-9065-0204cd3728dd

MWE:

using DiffEqGPU, SciMLBase, StaticArrays, LinearAlgebra

using Metal

backend = MetalBackend()

function lorenz(u, p, t)
    σ = p[1]
    ρ = p[2]
    β = p[3]
    du1 = σ * (u[2] - u[1])
    du2 = u[1] * (ρ - u[3]) - u[2]
    du3 = u[1] * u[2] - β * u[3]
    return SVector{3}(du1, du2, du3)
end

u0 = @SVector [1.0f0; 0.0f0; 0.0f0]
tspan = (0.0f0, 10.0f0)
p = @SVector [10.0f0, 28.0f0, 8 / 3.0f0]
prob = ODEProblem{false}(lorenz, u0, tspan, p)

using Test

alg = GPUTsit5()

prob_func = (prob, i, repeat) -> remake(prob, p = p)
monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy = false)
@info typeof(alg)

asol = solve(monteprob, alg, EnsembleGPUKernel(backend), trajectories = 10,
    adaptive = true, dt = 0.1f-1)

asol[1] # Didn't compute the solution

I also tried with everything on the latest development branches:

Status `~/test_DiffEqGPU/Project.toml`
  [071ae1c0] DiffEqGPU v3.4.0 `~/.julia/dev/DiffEqGPU`
  [61eb1bfa] GPUCompiler v0.25.0 `https://github.com/JuliaGPU/GPUCompiler.jl.git#master`
  [dde4c033] Metal v0.5.1 `https://github.com/JuliaGPU/Metal.jl.git#main`

Some of the kernels appear to work fine (e.g., the fixed time-stepping ones), and the other backends are fine as well.
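To make "didn't compute the solution" easier to check across all trajectories, here is a rough sketch of a helper that flags trajectories whose saved time points stop before the end of `tspan`. The function name `incomplete_trajectories` and the tolerance are hypothetical (not part of DiffEqGPU), and it assumes that "incomplete" means the last saved time falls short of the final time. The stand-in data below replaces a real GPU solve so the snippet runs without Metal.

```julia
# Hypothetical helper (not part of DiffEqGPU): return the indices of
# trajectories whose last saved time point falls short of the final time `tf`.
# A trajectory with no saved points at all is also reported.
function incomplete_trajectories(times::AbstractVector{<:AbstractVector}, tf; rtol = 1.0f-3)
    return [i for (i, t) in enumerate(times)
            if isempty(t) || !isapprox(last(t), tf; rtol = rtol)]
end

# Stand-in data (assumption, not real output): trajectory 1 reaches t = 10,
# trajectory 2 stops early, trajectory 3 saved nothing.
times = [Float32[0, 5, 10], Float32[0, 0.01], Float32[]]
bad = incomplete_trajectories(times, 10.0f0)

# With the MWE above, the call would presumably look like this
# (assuming each ensemble member exposes its time points as `.t`):
# bad = incomplete_trajectories([s.t for s in asol.u], last(tspan))
```

Running the same check on the fixed-step and adaptive solves, or on another backend, might help narrow down which kernels are affected.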

Can you isolate this to the actual kernels that produce different results? It's the first step I would have to do anyway, so it would be better if somebody with DiffEqGPU experience takes care of that.