SciML/DiffEqGPU.jl

ContinuousCallback does not work correctly with EnsembleGPUArray()

martin-abrudsky opened this issue · 8 comments

Hello, I am solving a central-potential problem with EnsembleGPUArray() for multiple initial conditions. When I add the ContinuousCallback, the first 800 trajectories ignore the affect!(integrator) and only the last 200 apply the cut correctly. If I solve for 10_000 trajectories, the first 8_000 ignore the cut and the last 2_000 apply it correctly, and similarly for 1_000_000 trajectories. The code is the following:

```julia
using CUDA
using DiffEqGPU
using NPZ
using OrdinaryDiffEq
using Plots

"""
     pot_central!(du,u,p,t)
     u=[x,dx,y,dy,# of steps]
     p=[k,m,dt]
"""
function pot_central!(du,u,p,t)
        r3 = ( u[1]^2 + u[3]^2 )^(3/2)
     du[1] = u[2]                                     # u[2]= dx
     du[2] = -( p[1]*u[1] ) / ( p[2]*r3 )   
     du[3] = u[4]                                     # u[4]= dy
     du[4] = -( p[1]*u[3] ) / ( p[2]*r3 )
     du[5] = 1/p[3]
end

T=100.0
trajectories=1_000
dt=0.01

u_rand = convert(Array{Float64}, npzread("IO_GPU/IO_u0.npy"))
u0 = [2.0, 2.0, 1.0, 1.5, dt]
p = [1.0, 1.0, dt]              # [k, m, dt]
tspan = (0.0, T)
R2 = [4.5, 5_000.0]             # R2 = [Rmin2, Rmax2]

prob = ODEProblem{true}(pot_central!, u0, tspan, p)
prob_func = (prob, i, repeat) -> remake(prob, u0 = [u_rand[i, :]; dt] .* u0 + [1.0, 1.0, 1.0, 1.0, dt])
Ensemble_Problem = EnsembleProblem(prob, prob_func = prob_func, safetycopy = false)

function condition(u, t, integrator)
    r2 = u[1]^2 + u[3]^2
    (R2[2] - r2) * (r2 - R2[1])
end

affect!(integrator) = terminate!(integrator)
gpu_cb = ContinuousCallback(condition, affect!; save_positions = (true, false), rootfind = true, interp_points = 0, abstol = 1e-7, reltol = 0)

sol = solve(Ensemble_Problem,
            Tsit5(),
            EnsembleGPUArray(),
            dt = 0.01,
            trajectories = trajectories,
            #batch_size = 10_000,
            callback = gpu_cb,
            adaptive = false,
            save_everystep = false
            )
```

The solutions for some trajectories are as follows

```
#u=[x,dx,y,dy,# of steps]

sol[1]

retcode: Success
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
   0.0
 100.0
u: 2-element Vector{SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}}:
 [1.6916643844888346, 1.3952481788899702, 1.4257929742644806, 2.3254414019025376, 0.0101]
 [130.3632437338083, 1.283605087637319, 221.2362706355311, 2.193291036113921, 10000.010099999996]

sol[800]

retcode: Success
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
   0.0
 100.0
u: 2-element Vector{SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}}:
 [2.3286232359143177, 2.0849960853408054, 1.5481395680210261, 1.0878883702422468, 0.0101]
 [198.03489179352627, 1.9510332210035648, 102.76307791913189, 1.008911077728196, 10000.010099999996]

sol[801]

retcode: Terminated
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
  0.0
 28.22912660730258
u: 2-element Vector{Vector{Float64}}:
 [1.3698092340031855, 2.018364836988968, 1.9808709009626209, 1.5912143969170325, 0.0101]
 [55.503066919039185, 1.9067772301987758, 43.81106666791687, 1.4723417477028182, 2822.922760730258]

sol[802]

retcode: Terminated
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
  0.0
 30.891877286483595
u: 2-element Vector{Vector{Float64}}:
 [2.2032317939655903, 1.2169273303092267, 1.6948012447348866, 2.0018032820652945, 0.0101]
 [36.85704302055922, 1.1131346681788845, 60.34532608065554, 1.8862192286561592, 3089.1978286483586]
```

Thank you very much in advance.

Can you try the following:

```julia
sol = solve(Ensemble_Problem,
            Tsit5(),
            EnsembleGPUArray(),
            dt = 0.01,
            trajectories = trajectories,
            #batch_size = 10_000,
            callback = gpu_cb,
            adaptive = false,
            save_everystep = false,
            merge_callbacks = true
            )
```

Hello, I tried your recommendation, but it gives me this error:

```
InvalidIRError: compiling kernel #gpu_continuous_condition_kernel(KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, typeof(condition), CuDeviceVector{Float64, 1}, CuDeviceMatrix{Float64, 1}, Float64, CuDeviceMatrix{Float64, 1}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to convert)
Stacktrace:
 [1] setindex!
   @ ~/.julia/packages/CUDA/DfvRa/src/device/array.jl:194
 [2] macro expansion
   @ ~/.julia/packages/DiffEqGPU/CiiCq/src/DiffEqGPU.jl:63
 [3] gpu_continuous_condition_kernel
   @ ~/.julia/packages/KernelAbstractions/C8flJ/src/macros.jl:81
 [4] gpu_continuous_condition_kernel
   @ ./none:0
Reason: unsupported dynamic function invocation (call to getindex)
Stacktrace:
 [1] condition
   @ ~/FAMAF/Beca_CIN_Trabajo_Final/skymap/GPU_Julia/pot_central_GPU_Float64.ipynb:21
 [2] macro expansion
   @ ~/.julia/packages/DiffEqGPU/CiiCq/src/DiffEqGPU.jl:63
 [3] gpu_continuous_condition_kernel
   @ ~/.julia/packages/KernelAbstractions/C8flJ/src/macros.jl:81
 [4] gpu_continuous_condition_kernel
   @ ./none:0
Reason: unsupported dynamic function invocation (call to -)
Stacktrace:
 [1] condition
   @ ~/FAMAF/Beca_CIN_Trabajo_Final/skymap/GPU_Julia/pot_central_GPU_Float64.ipynb:21
...
    @ ~/.julia/packages/CUDA/DfvRa/src/utilities.jl:25 [inlined]
 [33] top-level scope
    @ ~/.julia/packages/CUDA/DfvRa/src/pool.jl:490 [inlined]
 [34] top-level scope
    @ ~/FAMAF/Beca_CIN_Trabajo_Final/skymap/GPU_Julia/pot_central_GPU_Float64.ipynb:0
```

  1. I think that issue arises because you're using global variables inside condition, which makes the condition not GPU-compatible. Support for callbacks is limited, so use constants directly within the condition (see the sketch after this list).

  2. However, I saw that your affect! is a terminate!, which is not supported yet #43. We're working on it.
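
For illustration, here is a minimal sketch of point 1, with the radial bounds inlined as literals instead of read from the global R2 (untested, just to show the shape):

```julia
# Sketch only: hard-code the bounds from R2 = [4.5, 5_000.0] so the condition
# closes over nothing but `u` and literal constants.
function condition(u, t, integrator)
    r2 = u[1]^2 + u[3]^2
    (5_000.0 - r2) * (r2 - 4.5)   # changes sign when r2 leaves [4.5, 5_000.0]
end
```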

cc @ChrisRackauckas

OK, so terminate! is also not supported by EnsembleGPUKernel(). I ask because I tried to solve this problem using that ensemble: I first tried the "Callbacks with EnsembleGPUKernel" example from the README, but I got the following error:

```
GPU compilation of kernel #tsit5_kernel(CUDA.CuDeviceVector{ODEProblem{SVector{1, Float32}, Tuple{Float32, Float32}, false, SciMLBase.NullParameters, ODEFunction{false, SciMLBase.AutoSpecialize, typeof(f), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, 1}, CUDA.CuDeviceMatrix{SVector{1, Float32}, 1}, CUDA.CuDeviceMatrix{Float32, 1}, Float32, CallbackSet{Tuple{}, Tuple{DiscreteCallback{typeof(condition), typeof(affect!), typeof(SciMLBase.INITIALIZE_DEFAULT), typeof(SciMLBase.FINALIZE_DEFAULT)}}}, CUDA.CuDeviceVector{Float32, 1}, Int64, Nothing, Val{true}) failed
KernelError: passing and using non-bitstype argument

Argument 6 to your kernel function is of type CallbackSet{Tuple{}, Tuple{DiscreteCallback{typeof(condition), typeof(affect!), typeof(SciMLBase.INITIALIZE_DEFAULT), typeof(SciMLBase.FINALIZE_DEFAULT)}}}, which is not isbits:
  .discrete_callbacks is of type Tuple{DiscreteCallback{typeof(condition), typeof(affect!), typeof(SciMLBase.INITIALIZE_DEFAULT), typeof(SciMLBase.FINALIZE_DEFAULT)}} which is not isbits.
    .1 is of type DiscreteCallback{typeof(condition), typeof(affect!), typeof(SciMLBase.INITIALIZE_DEFAULT), typeof(SciMLBase.FINALIZE_DEFAULT)} which is not isbits.
      .save_positions is of type BitVector which is not isbits.
        .chunks is of type Vector{UInt64} which is not isbits.
```

I get the same error in my code.
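
As an aside, a quick way to see the non-isbits problem directly (a minimal, standalone sketch using only SciMLBase and Base) is to check the callback type yourself:

```julia
using SciMLBase  # DiscreteCallback is defined here (and re-exported by OrdinaryDiffEq)

condition(u, t, integrator) = t == 4.0f0
affect!(integrator) = nothing
cb = DiscreteCallback(condition, affect!; save_positions = (false, false))

# `save_positions` is stored as a BitVector, which wraps a heap-allocated Vector{UInt64},
# so the callback struct is not a bits type and cannot be passed into a CUDA kernel as-is.
println(isbitstype(typeof(cb)))   # expected output: false
```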

With EnsembleGPUArray, that's a won't-fix. We should just throw an error for this and tell people they're required to use the kernel-generating methods.

Yes, I'll make a PR for it. A lot of stuff needs to be added to the docs as well.

With the new docs released and terminate! support in EnsembleGPUKernel, can you try this again?

It's now documented and all, so closing. EnsembleGPUKernel is the answer, as per the docs.
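
For anyone landing here from search, a rough sketch of what the EnsembleGPUKernel version of this problem could look like. This is an assumption-laden outline, not a tested solution: it assumes the GPUTsit5 solver and an out-of-place SVector/Float32 formulation as the kernel ensembles require, drops the step-counter state and the per-trajectory prob_func for brevity, and the exact EnsembleGPUKernel constructor arguments depend on the DiffEqGPU version (newer releases take a backend, e.g. EnsembleGPUKernel(CUDA.CUDABackend())). See the docs for the authoritative example.

```julia
using CUDA, DiffEqGPU, OrdinaryDiffEq, StaticArrays

# Out-of-place RHS on a static vector, as the kernel-generating ensembles require.
function pot_central(u, p, t)
    r3 = (u[1]^2 + u[3]^2)^(3.0f0 / 2)
    SVector{4}(u[2],
               -(p[1] * u[1]) / (p[2] * r3),
               u[4],
               -(p[1] * u[3]) / (p[2] * r3))
end

u0 = SVector{4, Float32}(2.0, 2.0, 1.0, 1.5)
p = SVector{2, Float32}(1.0, 1.0)                  # [k, m]
prob = ODEProblem{false}(pot_central, u0, (0.0f0, 100.0f0), p)
monteprob = EnsembleProblem(prob, safetycopy = false)

# Terminate when r2 crosses either radial bound; the bounds are literals so the
# condition stays GPU-compatible (no globals).
function condition(u, t, integrator)
    r2 = u[1]^2 + u[3]^2
    (5_000.0f0 - r2) * (r2 - 4.5f0)
end
affect!(integrator) = terminate!(integrator)
cb = ContinuousCallback(condition, affect!; save_positions = (false, false))

sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(),
            trajectories = 1_000, adaptive = false, dt = 0.01f0,
            callback = cb, merge_callbacks = true)
```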