Metal.jl produces incorrect (incomplete) results with DiffEqGPU on Julia v1.10
utkarsh530 commented
https://buildkite.com/julialang/diffeqgpu-dot-jl/builds/1005#018cf3fe-98ac-455f-9065-0204cd3728dd
MWE:
using DiffEqGPU, SciMLBase, StaticArrays, LinearAlgebra
using Metal
backend = MetalBackend()
function lorenz(u, p, t)
    σ = p[1]
    ρ = p[2]
    β = p[3]
    du1 = σ * (u[2] - u[1])
    du2 = u[1] * (ρ - u[3]) - u[2]
    du3 = u[1] * u[2] - β * u[3]
    return SVector{3}(du1, du2, du3)
end
u0 = @SVector [1.0f0; 0.0f0; 0.0f0]
tspan = (0.0f0, 10.0f0)
p = @SVector [10.0f0, 28.0f0, 8 / 3.0f0]
prob = ODEProblem{false}(lorenz, u0, tspan, p)
using Test
alg = GPUTsit5()
prob_func = (prob, i, repeat) -> remake(prob, p = p)
monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy = false)
@info typeof(alg)
asol = solve(monteprob, alg, EnsembleGPUKernel(backend), trajectories = 10,
    adaptive = true, dt = 0.1f-1)
asol[1] # Didn't compute the solution
I also tried with everything on the main branches:
Status `~/test_DiffEqGPU/Project.toml`
[071ae1c0] DiffEqGPU v3.4.0 `~/.julia/dev/DiffEqGPU`
[61eb1bfa] GPUCompiler v0.25.0 `https://github.com/JuliaGPU/GPUCompiler.jl.git#master`
[dde4c033] Metal v0.5.1 `https://github.com/JuliaGPU/Metal.jl.git#main`
Some of the kernels appear to work fine (e.g., the fixed time-stepping ones), and the other backends are not affected.
maleadt commented
Can you isolate this to the actual kernels that produce different results? It's the first step I would have to do anyway, so it would be better if somebody with DiffEqGPU experience takes care of that.
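A possible starting point for that isolation, as a minimal sketch rather than a definitive procedure: run the same ensemble for several kernel algorithms with both fixed and adaptive time stepping, and record which combinations integrate all the way to the end of `tspan`. The helpers `reached_end` and `completeness_report` are hypothetical names introduced here, and the choice of `GPUVern7` and `GPUVern9` as additional algorithms to try is an assumption; only `GPUTsit5` with `adaptive = true` is reported as failing above.

```julia
using DiffEqGPU, SciMLBase, StaticArrays

# Same Lorenz system as in the MWE.
function lorenz(u, p, t)
    σ, ρ, β = p[1], p[2], p[3]
    du1 = σ * (u[2] - u[1])
    du2 = u[1] * (ρ - u[3]) - u[2]
    du3 = u[1] * u[2] - β * u[3]
    return SVector{3}(du1, du2, du3)
end

# Hypothetical helper: did the saved time points reach the end of tspan?
reached_end(ts, tend) = !isempty(ts) && isapprox(ts[end], tend; rtol = 1.0f-3)

# Hypothetical helper: solve with each (algorithm, adaptive) combination on
# `backend` and record whether every trajectory was integrated up to tspan[2].
function completeness_report(backend; trajectories = 10)
    u0 = @SVector [1.0f0; 0.0f0; 0.0f0]
    tspan = (0.0f0, 10.0f0)
    p = @SVector [10.0f0, 28.0f0, 8 / 3.0f0]
    prob = ODEProblem{false}(lorenz, u0, tspan, p)
    prob_func = (prob, i, repeat) -> remake(prob, p = p)
    monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy = false)

    report = []
    for alg in (GPUTsit5(), GPUVern7(), GPUVern9()), adaptive in (false, true)
        sol = solve(monteprob, alg, EnsembleGPUKernel(backend);
            trajectories = trajectories, adaptive = adaptive, dt = 0.1f-1)
        # sol.u holds the individual trajectories of the ensemble solution.
        complete = all(i -> reached_end(sol.u[i].t, tspan[2]), 1:trajectories)
        push!(report, (alg = nameof(typeof(alg)), adaptive = adaptive, complete = complete))
    end
    return report
end
```

On the affected machine this would be called as `using Metal; completeness_report(MetalBackend())`. Any row with `complete = false` points at a kernel worth reducing further. The check only tests whether the final saved time matches `tspan[2]`, so it detects incomplete solves like the one in the MWE but not numerically wrong values.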