ContinuousCallback does not work correctly with EnsembleGPUArray()
martin-abrudsky opened this issue · 8 comments
Hello, I am solving a central-potential problem with `EnsembleGPUArray()` for multiple initial conditions. When I add a `ContinuousCallback`, the first 800 trajectories ignore the `affect!(integrator)` and the last 200 apply the cut correctly. If I solve for 10_000 trajectories, the first 8_000 ignore the cut and the last 2_000 apply it correctly, and similarly for 1_000_000 trajectories. The code is the following:
```julia
using CUDA
using DiffEqGPU
using NPZ
using OrdinaryDiffEq
using Plots

"""
    pot_central!(du,u,p,t)

u = [x, dx, y, dy, # of steps]
p = [k, m, dt]
"""
function pot_central!(du, u, p, t)
    r3 = (u[1]^2 + u[3]^2)^(3/2)
    du[1] = u[2]                        # u[2] = dx
    du[2] = -(p[1]*u[1]) / (p[2]*r3)
    du[3] = u[4]                        # u[4] = dy
    du[4] = -(p[1]*u[3]) / (p[2]*r3)
    du[5] = 1/p[3]
end

T = 100.0
trajectories = 1_000
dt = 0.01
u_rand = convert(Array{Float64}, npzread("IO_GPU/IO_u0.npy"))
u0 = [2.0, 2.0, 1.0, 1.5, dt]
p = [1.0, 1.0, dt]      # [k,m,dt]
tspan = (0.0, T)
R2 = [4.5, 5_000.0]     # R2 = [Rmin2,Rmax2]

prob = ODEProblem{true}(pot_central!, u0, tspan, p)
prob_func = (prob, i, repeat) -> remake(prob, u0 = [u_rand[i,:]; dt] .* u0 + [1.0, 1.0, 1.0, 1.0, dt])
Ensemble_Problem = EnsembleProblem(prob, prob_func = prob_func, safetycopy = false)

function condition(u, t, integrator)
    r2 = u[1]^2 + u[3]^2
    (R2[2] - r2) * (r2 - R2[1])
end
affect!(integrator) = terminate!(integrator)
gpu_cb = ContinuousCallback(condition, affect!;
                            save_positions = (true, false), rootfind = true,
                            interp_points = 0, abstol = 1e-7, reltol = 0)

sol = solve(Ensemble_Problem,
            Tsit5(),
            EnsembleGPUArray(),
            dt = 0.01,
            trajectories = trajectories,
            #batch_size=10_000,
            callback = gpu_cb,
            adaptive = false,
            save_everystep = false)
```
The solutions for some trajectories are as follows:

```julia
# u = [x, dx, y, dy, # of steps]
sol[1]
retcode: Success
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
 0.0
 100.0
u: 2-element Vector{SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}}:
 [1.6916643844888346, 1.3952481788899702, 1.4257929742644806, 2.3254414019025376, 0.0101]
 [130.3632437338083, 1.283605087637319, 221.2362706355311, 2.193291036113921, 10000.010099999996]

sol[800]
retcode: Success
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
 0.0
 100.0
u: 2-element Vector{SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}}:
 [2.3286232359143177, 2.0849960853408054, 1.5481395680210261, 1.0878883702422468, 0.0101]
 [198.03489179352627, 1.9510332210035648, 102.76307791913189, 1.008911077728196, 10000.010099999996]

sol[801]
retcode: Terminated
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
 0.0
 28.22912660730258
u: 2-element Vector{Vector{Float64}}:
 [1.3698092340031855, 2.018364836988968, 1.9808709009626209, 1.5912143969170325, 0.0101]
 [55.503066919039185, 1.9067772301987758, 43.81106666791687, 1.4723417477028182, 2822.922760730258]

sol[802]
retcode: Terminated
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
 0.0
 30.891877286483595
u: 2-element Vector{Vector{Float64}}:
 [2.2032317939655903, 1.2169273303092267, 1.6948012447348866, 2.0018032820652945, 0.0101]
 [36.85704302055922, 1.1131346681788845, 60.34532608065554, 1.8862192286561592, 3089.1978286483586]
```
Thanks in advance.
Can you try the following:
```julia
sol = solve(Ensemble_Problem,
            Tsit5(),
            EnsembleGPUArray(),
            dt = 0.01,
            trajectories = trajectories,
            #batch_size=10_000,
            callback = gpu_cb,
            adaptive = false,
            save_everystep = false,
            merge_callbacks = true)
```
Hello, I tried your recommendation, but it gives me this error:
```
InvalidIRError: compiling kernel #gpu_continuous_condition_kernel(KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, typeof(condition), CuDeviceVector{Float64, 1}, CuDeviceMatrix{Float64, 1}, Float64, CuDeviceMatrix{Float64, 1}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to convert)
Stacktrace:
 [1] setindex!
   @ ~/.julia/packages/CUDA/DfvRa/src/device/array.jl:194
 [2] macro expansion
   @ ~/.julia/packages/DiffEqGPU/CiiCq/src/DiffEqGPU.jl:63
 [3] gpu_continuous_condition_kernel
   @ ~/.julia/packages/KernelAbstractions/C8flJ/src/macros.jl:81
 [4] gpu_continuous_condition_kernel
   @ ./none:0
Reason: unsupported dynamic function invocation (call to getindex)
Stacktrace:
 [1] condition
   @ ~/FAMAF/Beca_CIN_Trabajo_Final/skymap/GPU_Julia/pot_central_GPU_Float64.ipynb:21
 [2] macro expansion
   @ ~/.julia/packages/DiffEqGPU/CiiCq/src/DiffEqGPU.jl:63
 [3] gpu_continuous_condition_kernel
   @ ~/.julia/packages/KernelAbstractions/C8flJ/src/macros.jl:81
 [4] gpu_continuous_condition_kernel
   @ ./none:0
Reason: unsupported dynamic function invocation (call to -)
Stacktrace:
 [1] condition
   @ ~/FAMAF/Beca_CIN_Trabajo_Final/skymap/GPU_Julia/pot_central_GPU_Float64.ipynb:21
...
   @ ~/.julia/packages/CUDA/DfvRa/src/utilities.jl:25 [inlined]
 [33] top-level scope
   @ ~/.julia/packages/CUDA/DfvRa/src/pool.jl:490 [inlined]
 [34] top-level scope
   @ ~/FAMAF/Beca_CIN_Trabajo_Final/skymap/GPU_Julia/pot_central_GPU_Float64.ipynb:0
```
- I think that issue arises because you're using global variables inside `condition`, which doesn't make the `condition` GPU-compatible. The support of callbacks is limited. Hence, use constants directly within the `condition`.
- However, I saw that your `affect!` is a `terminate!`, which is not supported yet (#43). We're working on it.
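As a sketch of that first suggestion (the literals below are simply the `R2 = [4.5, 5_000.0]` bounds inlined by hand; any GPU-compatible constants would do), the callback can avoid touching the global array entirely:

```julia
# Sketch: a condition free of global variables, so the generated GPU kernel
# only works with isbits values. The literals 4.5 and 5_000.0 are the
# Rmin^2 / Rmax^2 bounds from R2 above, hard-coded instead of captured.
function condition(u, t, integrator)
    r2 = u[1]^2 + u[3]^2
    (5_000.0 - r2) * (r2 - 4.5)   # positive inside the annulus, negative outside
end
```

Alternatively, the bounds could be carried in the parameter vector (e.g. `p = [k, m, dt, Rmin2, Rmax2]`) and read as `integrator.p[4]` / `integrator.p[5]`, which also keeps the kernel free of globals without hard-coding.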
OK. Is `terminate!` also not supported by `EnsembleGPUKernel()`? I ask because I tried to solve this problem using that ensemble, for which I first tried the "Callbacks with EnsembleGPUKernel" example found in the README; however, I got the following error:
```
GPU compilation of kernel #tsit5_kernel(CUDA.CuDeviceVector{ODEProblem{SVector{1, Float32}, Tuple{Float32, Float32}, false, SciMLBase.NullParameters, ODEFunction{false, SciMLBase.AutoSpecialize, typeof(f), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, 1}, CUDA.CuDeviceMatrix{SVector{1, Float32}, 1}, CUDA.CuDeviceMatrix{Float32, 1}, Float32, CallbackSet{Tuple{}, Tuple{DiscreteCallback{typeof(condition), typeof(affect!), typeof(SciMLBase.INITIALIZE_DEFAULT), typeof(SciMLBase.FINALIZE_DEFAULT)}}}, CUDA.CuDeviceVector{Float32, 1}, Int64, Nothing, Val{true}) failed
KernelError: passing and using non-bitstype argument
Argument 6 to your kernel function is of type CallbackSet{Tuple{}, Tuple{DiscreteCallback{typeof(condition), typeof(affect!), typeof(SciMLBase.INITIALIZE_DEFAULT), typeof(SciMLBase.FINALIZE_DEFAULT)}}}, which is not isbits:
  .discrete_callbacks is of type Tuple{DiscreteCallback{typeof(condition), typeof(affect!), typeof(SciMLBase.INITIALIZE_DEFAULT), typeof(SciMLBase.FINALIZE_DEFAULT)}} which is not isbits.
    .1 is of type DiscreteCallback{typeof(condition), typeof(affect!), typeof(SciMLBase.INITIALIZE_DEFAULT), typeof(SciMLBase.FINALIZE_DEFAULT)} which is not isbits.
      .save_positions is of type BitVector which is not isbits.
        .chunks is of type Vector{UInt64} which is not isbits.
```
I get the same error in my code.
With EnsembleGPUArray, that's a won't-fix. We should just throw an error for this and tell people they're required to use the kernel-generating methods.
Yes, I'll make a PR for it. A lot of stuff needs to go into the docs as well.
With the new docs released and `terminate!` support in `EnsembleGPUKernel`, can you try this again?
It's now documented and all, so closing. `EnsembleGPUKernel` is the answer, as per the docs.
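For reference, a minimal sketch of the `EnsembleGPUKernel` callback pattern along the lines of the README example discussed above (the exponential-decay model, the `4.0f0` stopping time, and the trajectory count are illustrative assumptions, not the original central-potential problem; newer DiffEqGPU releases may also require passing a backend such as `EnsembleGPUKernel(CUDA.CUDABackend())`):

```julia
using CUDA, DiffEqGPU, OrdinaryDiffEq, StaticArrays

# EnsembleGPUKernel compiles one kernel per trajectory, so the problem must be
# out-of-place and built from isbits values (SVector state, Float32 scalars).
f(u, p, t) = SVector{1}(-u[1])

u0 = @SVector [10.0f0]
prob = ODEProblem{false}(f, u0, (0.0f0, 10.0f0))
monteprob = EnsembleProblem(prob, safetycopy = false)

# terminate! is supported here; save_positions must stay (false, false)
# to keep the callback isbits.
condition(u, t, integrator) = t == 4.0f0
affect!(integrator) = terminate!(integrator)
cb = DiscreteCallback(condition, affect!; save_positions = (false, false))

# tstops forces a step to land exactly on t = 4.0f0 so the discrete
# condition can fire under fixed-step integration.
sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(),
            trajectories = 10_000,
            adaptive = false, dt = 0.01f0,
            callback = cb, merge_callbacks = true,
            tstops = [4.0f0])
```

The central-potential problem would follow the same shape: rewrite `pot_central!` as an out-of-place `SVector{5,Float32}` function and keep the termination bounds as literals or parameters rather than globals.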