LoadError: KernelException on README example
lazarusA opened this issue · 5 comments
Hi, if I run the following:
using OrdinaryDiffEq, CUDA, LinearAlgebra
using DiffEqGPU
function lorenz(du, u, p, t)
    du[1] = p[1] * (u[2] - u[1])
    du[2] = u[1] * (p[2] - u[3]) - u[2]
    du[3] = u[1] * u[2] - p[3] * u[3]
end
u0 = Float32[1.0; 0.0; 0.0]
tspan = (0.0f0, 100.0f0)
p = [10.0f0, 28.0f0, 8 / 3.0f0]
prob = ODEProblem(lorenz, u0, tspan, p)
prob_func = (prob, i, repeat) -> remake(prob, p = rand(Float32, 3) .* p)
monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy = false)
sol = solve(monteprob, Tsit5(), EnsembleGPUArray(), trajectories = 10, saveat = 1.0f0)
I get the following error:
ERROR: a exception was thrown during kernel execution.
Run Julia on debug level 2 for device stack traces.
ERROR: LoadError: KernelException: exception thrown during kernel execution on device Tesla V100-DGXS-16GB
Stacktrace:
This is my env (DiffEqGPU and DiffEqGPU#master show the same error):
Status `~/JuliaConCUDA/Project.toml`
[621f4979] AbstractFFTs v1.1.0
[6e4b80f9] BenchmarkTools v1.2.2
[052768ef] CUDA v3.8.0
[72cfdca4] CUDAKernels v0.2.1
[3da002f7] ColorTypes v0.11.0
[5ae59095] Colors v0.12.8
[071ae1c0] DiffEqGPU v1.15.0 `https://github.com/SciML/DiffEqGPU.jl.git#master`
[5789e2e9] FileIO v1.13.0
[53c48c17] FixedPointNumbers v0.8.4
[f332f351] ImageContrastAdjustment v0.3.10
[a09fc81d] ImageCore v0.9.3
[6a3955dd] ImageFiltering v0.7.1
[6218d12a] ImageMagick v1.2.2
[4e3cecfd] ImageShow v0.3.3
[63c18a36] KernelAbstractions v0.6.3
[1dea7af3] OrdinaryDiffEq v6.6.6
[62fd8b95] TensorCore v0.1.1
[5e47fb64] TestImages v1.6.2
[bc48ee85] Tullio v0.3.3
[37e2e46d] LinearAlgebra
Try:
function lorenz(du,u,p,t)
    @inbounds begin
        du[1] = p[1]*(u[2]-u[1])
        du[2] = u[1]*(p[2]-u[3]) - u[2]
        du[3] = u[1]*u[2] - p[3]*u[3]
    end
end
Same output. It would be nice to have a Project file under which things are known to work; I probably have the wrong combination of dependencies. Here is the complete example again, run in a clean env.
using OrdinaryDiffEq, CUDA, LinearAlgebra
using DiffEqGPU
function lorenz(du,u,p,t)
    @inbounds begin
        du[1] = p[1]*(u[2]-u[1])
        du[2] = u[1]*(p[2]-u[3]) - u[2]
        du[3] = u[1]*u[2] - p[3]*u[3]
    end
end
u0 = Float32[1.0; 0.0; 0.0]
tspan = (0.0f0, 100.0f0)
p = [10.0f0, 28.0f0, 8 / 3.0f0]
prob = ODEProblem(lorenz, u0, tspan, p)
prob_func = (prob, i, repeat) -> remake(prob, p = rand(Float32, 3) .* p)
monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy = false)
sol = solve(monteprob, Tsit5(), EnsembleGPUArray(), trajectories = 10, saveat = 1.0f0)
My current env:
(JuliaConCUDA) pkg> st
Status `~/JuliaConCUDA/Project.toml`
[052768ef] CUDA v3.8.0
[071ae1c0] DiffEqGPU v1.15.0
[1dea7af3] OrdinaryDiffEq v6.6.6
[37e2e46d] LinearAlgebra
And the output after running:
~/JuliaConCUDA$ julia -g2 --project testDiffs.jl
(Each of the following device-side lines appeared eight times in the output.)
ERROR: a exception was thrown during kernel execution.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] overdub at ./error.jl:33
[3] const_arrayref(::CuDeviceMatrix{Float32, 1}, ::Int64) at /User/homes/lalonso/.julia/packages/CUDA/bki2w/src/device/utils.jl:49
[4] overdub at /User/homes/lalonso/.julia/packages/CUDA/bki2w/src/device/utils.jl:49
[5] getindex(::CUDA.Const{Float32, 2, 1}, ::Int64) at /User/homes/lalonso/.julia/packages/CUDA/bki2w/src/device/array.jl:232
[6] overdub at /User/homes/lalonso/.julia/packages/CUDA/bki2w/src/device/array.jl:232
[7] overdub at ./subarray.jl:309
[8] overdub at /User/homes/lalonso/JuliaConCUDA/testDiffs.jl:5
[9] macro expansion at /User/homes/lalonso/.julia/packages/DiffEqGPU/Ibo20/src/DiffEqGPU.jl:20
[10] overdub at /User/homes/lalonso/.julia/packages/KernelAbstractions/Yy47c/src/macros.jl:80
[11] overdub at /User/homes/lalonso/.julia/packages/Cassette/1lyEM/src/overdub.jl:0
ERROR: LoadError: KernelException: exception thrown during kernel execution on device Tesla V100-DGXS-16GB
Stacktrace:
[1] check_exceptions()
@ CUDA ~/.julia/packages/CUDA/bki2w/src/compiler/exceptions.jl:34
[2] nonblocking_synchronize
@ ~/.julia/packages/CUDA/bki2w/lib/cudadrv/context.jl:329 [inlined]
[3] device_synchronize()
@ CUDA ~/.julia/packages/CUDA/bki2w/lib/cudadrv/context.jl:317
[4] CuModule(data::Vector{UInt8}, options::Dict{CUDA.CUjit_option_enum, Any})
@ CUDA ~/.julia/packages/CUDA/bki2w/lib/cudadrv/module.jl:41
[5] CuModule
@ ~/.julia/packages/CUDA/bki2w/lib/cudadrv/module.jl:23 [inlined]
[6] cufunction_link(job::GPUCompiler.CompilerJob, compiled::NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}})
@ CUDA ~/.julia/packages/CUDA/bki2w/src/compiler/execution.jl:451
[7] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
@ GPUCompiler ~/.julia/packages/GPUCompiler/1Ajz2/src/cache.jl:95
[8] cufunction(f::GPUArrays.var"#broadcast_kernel#17", tt::Type{Tuple{CUDA.CuKernelContext, CuDeviceMatrix{Float32, 1}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(muladd), Tuple{Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Nothing, typeof(DiffEqGPU.diffeqgpunorm), Tuple{Base.Broadcast.Extruded{CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Float32}}, Float32, Float32}}, Int64}}; name::Nothing, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ CUDA ~/.julia/packages/CUDA/bki2w/src/compiler/execution.jl:297
[9] cufunction
@ ~/.julia/packages/CUDA/bki2w/src/compiler/execution.jl:291 [inlined]
[10] macro expansion
@ ~/.julia/packages/CUDA/bki2w/src/compiler/execution.jl:102 [inlined]
[11] #launch_heuristic#268
@ ~/.julia/packages/CUDA/bki2w/src/gpuarrays.jl:17 [inlined]
[12] copyto!
@ ~/.julia/packages/GPUArrays/umZob/src/host/broadcast.jl:65 [inlined]
[13] copyto!
@ ./broadcast.jl:936 [inlined]
[14] materialize!
@ ./broadcast.jl:894 [inlined]
[15] materialize!
@ ./broadcast.jl:891 [inlined]
[16] fast_materialize!
@ ~/.julia/packages/FastBroadcast/yCuxg/src/FastBroadcast.jl:31 [inlined]
[17] ode_determine_initdt(u0::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, t::Float32, tdir::Float32, dtmax::Float32, abstol::Float32, reltol::Float32, internalnorm::typeof(DiffEqGPU.diffeqgpunorm), prob::ODEProblem{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Float32, Float32}, true, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ODEFunction{true, DiffEqGPU.var"#55#59"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, integrator::OrdinaryDiffEq.ODEIntegrator{Tsit5{typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, true, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Nothing, Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Float32, Float32, Float32, Float32, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, ODESolution{Float32, 3, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Nothing, Nothing, Vector{Float32}, Vector{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, ODEProblem{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Float32, Float32}, true, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ODEFunction{true, DiffEqGPU.var"#55#59"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, Tsit5{typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, OrdinaryDiffEq.InterpolationData{ODEFunction{true, DiffEqGPU.var"#55#59"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, 
Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Vector{Float32}, Vector{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, OrdinaryDiffEq.Tsit5Cache{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, OrdinaryDiffEq.Tsit5ConstantCache{Float32, Float32}, typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}}, DiffEqBase.DEStats}, ODEFunction{true, DiffEqGPU.var"#55#59"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, OrdinaryDiffEq.Tsit5Cache{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, OrdinaryDiffEq.Tsit5ConstantCache{Float32, Float32}, typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, OrdinaryDiffEq.DEOptions{Float32, Float32, Float32, Float32, PIController{Rational{Int64}}, typeof(DiffEqGPU.diffeqgpunorm), typeof(opnorm), Nothing, CallbackSet{Tuple{}, Tuple{}}, typeof(DiffEqBase.ODE_DEFAULT_ISOUTOFDOMAIN), typeof(DiffEqBase.ODE_DEFAULT_PROG_MESSAGE), DiffEqGPU.var"#12#18", DataStructures.BinaryHeap{Float32, DataStructures.FasterForward}, DataStructures.BinaryHeap{Float32, DataStructures.FasterForward}, Nothing, Nothing, Int64, Tuple{}, Float32, Tuple{}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Float32, Nothing, OrdinaryDiffEq.DefaultInit})
@ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/Op0Oq/src/initdt.jl:23
[18] auto_dt_reset!
@ ~/.julia/packages/OrdinaryDiffEq/Op0Oq/src/integrators/integrator_interface.jl:346 [inlined]
[19] handle_dt!(integrator::OrdinaryDiffEq.ODEIntegrator{Tsit5{typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, true, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Nothing, Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Float32, Float32, Float32, Float32, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, ODESolution{Float32, 3, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Nothing, Nothing, Vector{Float32}, Vector{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, ODEProblem{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Float32, Float32}, true, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ODEFunction{true, DiffEqGPU.var"#55#59"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, Tsit5{typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, OrdinaryDiffEq.InterpolationData{ODEFunction{true, DiffEqGPU.var"#55#59"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Vector{Float32}, Vector{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, OrdinaryDiffEq.Tsit5Cache{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, OrdinaryDiffEq.Tsit5ConstantCache{Float32, Float32}, typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}}, DiffEqBase.DEStats}, ODEFunction{true, DiffEqGPU.var"#55#59"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, UniformScaling{Bool}, 
Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, OrdinaryDiffEq.Tsit5Cache{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, OrdinaryDiffEq.Tsit5ConstantCache{Float32, Float32}, typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, OrdinaryDiffEq.DEOptions{Float32, Float32, Float32, Float32, PIController{Rational{Int64}}, typeof(DiffEqGPU.diffeqgpunorm), typeof(opnorm), Nothing, CallbackSet{Tuple{}, Tuple{}}, typeof(DiffEqBase.ODE_DEFAULT_ISOUTOFDOMAIN), typeof(DiffEqBase.ODE_DEFAULT_PROG_MESSAGE), DiffEqGPU.var"#12#18", DataStructures.BinaryHeap{Float32, DataStructures.FasterForward}, DataStructures.BinaryHeap{Float32, DataStructures.FasterForward}, Nothing, Nothing, Int64, Tuple{}, Float32, Tuple{}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Float32, Nothing, OrdinaryDiffEq.DefaultInit})
@ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/Op0Oq/src/solve.jl:504
[20] __init(prob::ODEProblem{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Float32, Float32}, true, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ODEFunction{true, DiffEqGPU.var"#55#59"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, alg::Tsit5{typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, timeseries_init::Tuple{}, ts_init::Tuple{}, ks_init::Tuple{}, recompile::Type{Val{true}}; saveat::Float32, tstops::Tuple{}, d_discontinuities::Tuple{}, save_idxs::Nothing, save_everystep::Bool, save_on::Bool, save_start::Bool, save_end::Nothing, callback::Nothing, dense::Bool, calck::Bool, dt::Float32, dtmin::Nothing, dtmax::Float32, force_dtmin::Bool, adaptive::Bool, gamma::Rational{Int64}, abstol::Nothing, reltol::Nothing, qmin::Rational{Int64}, qmax::Int64, qsteady_min::Int64, qsteady_max::Int64, beta1::Nothing, beta2::Nothing, qoldinit::Rational{Int64}, controller::Nothing, fullnormalize::Bool, failfactor::Int64, maxiters::Int64, internalnorm::typeof(DiffEqGPU.diffeqgpunorm), internalopnorm::typeof(opnorm), isoutofdomain::typeof(DiffEqBase.ODE_DEFAULT_ISOUTOFDOMAIN), unstable_check::DiffEqGPU.var"#12#18", verbose::Bool, timeseries_errors::Bool, dense_errors::Bool, advance_to_tstop::Bool, stop_at_next_tstop::Bool, initialize_save::Bool, progress::Bool, progress_steps::Int64, progress_name::String, progress_message::typeof(DiffEqBase.ODE_DEFAULT_PROG_MESSAGE), userdata::Nothing, allow_extrapolation::Bool, initialize_integrator::Bool, alias_u0::Bool, alias_du0::Bool, initializealg::OrdinaryDiffEq.DefaultInit, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/Op0Oq/src/solve.jl:466
[21] #__solve#495
@ ~/.julia/packages/OrdinaryDiffEq/Op0Oq/src/solve.jl:4 [inlined]
[22] #solve_call#37
@ ~/.julia/packages/DiffEqBase/1V2xg/src/solve.jl:61 [inlined]
[23] solve_up(prob::ODEProblem{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Float32, Float32}, true, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ODEFunction{true, DiffEqGPU.var"#55#59"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, sensealg::Nothing, u0::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, p::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, args::Tsit5{typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}; kwargs::Base.Iterators.Pairs{Symbol, Any, NTuple{5, Symbol}, NamedTuple{(:unstable_check, :saveat, :callback, :merge_callbacks, :internalnorm), Tuple{DiffEqGPU.var"#12#18", Float32, Nothing, Bool, typeof(DiffEqGPU.diffeqgpunorm)}}})
@ DiffEqBase ~/.julia/packages/DiffEqBase/1V2xg/src/solve.jl:87
[24] #solve#38
@ ~/.julia/packages/DiffEqBase/1V2xg/src/solve.jl:73 [inlined]
[25] batch_solve_up(ensembleprob::EnsembleProblem{ODEProblem{Vector{Float32}, Tuple{Float32, Float32}, true, Vector{Float32}, ODEFunction{true, typeof(lorenz), UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, probs::Vector{ODEProblem{Vector{Float32}, Tuple{Float32, Float32}, true, Vector{Float32}, ODEFunction{true, typeof(lorenz), UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}}, alg::Tsit5{typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, ensemblealg::EnsembleGPUArray, I::UnitRange{Int64}, u0::Matrix{Float32}, p::Matrix{Float32}; kwargs::Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:unstable_check, :saveat), Tuple{DiffEqGPU.var"#12#18", Float32}}})
@ DiffEqGPU ~/.julia/packages/DiffEqGPU/Ibo20/src/DiffEqGPU.jl:319
[26] batch_solve(ensembleprob::EnsembleProblem{ODEProblem{Vector{Float32}, Tuple{Float32, Float32}, true, Vector{Float32}, ODEFunction{true, typeof(lorenz), UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, alg::Tsit5{typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, ensemblealg::EnsembleGPUArray, I::UnitRange{Int64}; kwargs::Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:unstable_check, :saveat), Tuple{DiffEqGPU.var"#12#18", Float32}}})
@ DiffEqGPU ~/.julia/packages/DiffEqGPU/Ibo20/src/DiffEqGPU.jl:284
[27] macro expansion
@ ./timing.jl:287 [inlined]
[28] __solve(ensembleprob::EnsembleProblem{ODEProblem{Vector{Float32}, Tuple{Float32, Float32}, true, Vector{Float32}, ODEFunction{true, typeof(lorenz), UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, alg::Tsit5{typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, ensemblealg::EnsembleGPUArray; trajectories::Int64, batch_size::Int64, unstable_check::Function, kwargs::Base.Iterators.Pairs{Symbol, Float32, Tuple{Symbol}, NamedTuple{(:saveat,), Tuple{Float32}}})
@ DiffEqGPU ~/.julia/packages/DiffEqGPU/Ibo20/src/DiffEqGPU.jl:201
[29] #solve#40
@ ~/.julia/packages/DiffEqBase/1V2xg/src/solve.jl:101 [inlined]
[30] top-level scope
@ ~/JuliaConCUDA/testDiffs.jl:16
in expression starting at /User/homes/lalonso/JuliaConCUDA/testDiffs.jl:16
@maleadt could I get help on this one? It's really weird. In the package, I changed the code at the spot it's erroring at to:
Main.x[] = (sk,abstol,u0,t,reltol)
@.. sk = abstol+internalnorm(u0,t)*reltol
Then I ran:
using DiffEqGPU, OrdinaryDiffEq
function lorenz(du,u,p,t)
    du[1] = p[1]*(u[2]-u[1])
    du[2] = u[1]*(p[2]-u[3]) - u[2]
    du[3] = u[1]*u[2] - p[3]*u[3]
end
u0 = Float32[1.0;0.0;0.0]
tspan = (0.0f0,100.0f0)
p = [10.0f0,28.0f0,8/3f0]
prob = ODEProblem(lorenz,u0,tspan,p)
prob_func = (prob,i,repeat) -> remake(prob,p=rand(Float32,3).*p)
monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy=false)
@time sol = solve(monteprob,Tsit5(),EnsembleGPUArray(),trajectories=10_000,saveat=1.0f0,reltol=1f-6,abstol=1f-6)
x = Ref{Any}()
using CUDA
u = CuArray(rand(Float32,3,8000))
reltol = 1f-6
abstol=1f-6
sk = CuArray(rand(Float32,3,8000))
DiffEqBase.@.. x[][1] = x[][2]+DiffEqBase.ODE_DEFAULT_NORM(x[][3],x[][4])*x[][5]
So, magically, the same expression works at top-level scope, but I get a kernel compilation error when that expression, with the same arguments, runs inside the package. Do you know what could cause this behavior?
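For anyone who wants to reproduce this capture-and-replay approach without a GPU, here is a minimal CPU-only sketch of the pattern. The names `CAPTURED`, `stand_in_norm` and `inner_step!` are hypothetical stand-ins, not DiffEqGPU or OrdinaryDiffEq code, and plain `@.` is used in place of `@..`.

```julia
# Global slot that will hold the arguments seen inside the function (hypothetical name).
const CAPTURED = Ref{Any}()

# Hypothetical scalar norm, standing in for the solver's internalnorm.
stand_in_norm(u, t) = abs(u)

# Hypothetical stand-in for the package-internal code being investigated.
function inner_step!(sk, abstol, u0, t, reltol)
    CAPTURED[] = (sk, abstol, u0, t, reltol)            # stash the exact arguments
    @. sk = abstol + stand_in_norm(u0, t) * reltol      # the expression under test
    return sk
end

# Run the code path once so the arguments are captured.
inner_step!(zeros(Float32, 3), 1f-6, Float32[1, -2, 3], 0f0, 1f-6)

# Replay the same expression at top level with the captured arguments.
sk, abstol, u0, t, reltol = CAPTURED[]
replayed = @. abstol + stand_in_norm(u0, t) * reltol
```

If the replay succeeds while the in-package call fails, the arguments themselves are probably not the problem, which points toward how the surrounding code is compiled or which package versions are loaded.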
It was a version dependency issue, fixed by updating the KernelAbstractions version.
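A rough sketch of how such an update might be done from the active project follows. The helper name `update_and_report` is hypothetical, and the thread does not say which KernelAbstractions version resolved the problem, so this only updates to whatever the resolver allows and prints the result.

```julia
using Pkg

# Hypothetical helper: update every package in the active project, then print
# the resolved KernelAbstractions entry from the manifest.
function update_and_report()
    Pkg.update()                                            # pull newer compatible versions
    Pkg.status("KernelAbstractions"; mode = Pkg.PKGMODE_MANIFEST)  # includes indirect deps
end
```

Calling `update_and_report()` from a session started with `julia --project` would act on that project's environment; the manifest mode is used because KernelAbstractions is not a direct dependency in the clean env shown earlier.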
Thanks. It works now.