JuliaDiff/ForwardDiff.jl

Highly invalidating method: promote_rule(::Type{R}, ::Type{ForwardDiff.Dual{T, V, N}}) where {R<:Real, T, V, N}

ChrisRackauckas opened this issue · 2 comments

Found by:

using SnoopCompile
invalidations = @snoopr begin
    using OrdinaryDiffEq

    function lorenz(du, u, p, t)
        du[1] = 10.0(u[2] - u[1])
        du[2] = u[1] * (28.0 - u[3]) - u[2]
        du[3] = u[1] * u[2] - (8 / 3) * u[3]
    end
    u0 = [1.0; 0.0; 0.0]
    tspan = (0.0, 100.0)
    prob = ODEProblem{true,false}(lorenz, u0, tspan)
    alg = Rodas5()
    tinf = solve(prob, alg)
end;

trees = SnoopCompile.invalidation_trees(invalidations);

@show length(SnoopCompile.uinvalidated(invalidations)) # show total invalidations

show(trees[end]) # show the most invalidated method

# Count number of children (number of invalidations per invalidated method)
n_invalidations = map(trees) do methinvs
    SnoopCompile.countchildren(methinvs)
end

import Plots
Plots.plot(
    1:length(trees),
    n_invalidations;
    markershape=:circle,
    xlabel="i-th method invalidation",
    label="Number of children per method invalidations"
)

[plot: number of children per method invalidation, produced by the code above]

With SciML/SciMLBase.jl#348 it's the number one invalidating method:

inserting promote_rule(::Type{R}, ::Type{ForwardDiff.Dual{T, V, N}}) where {R<:Real, T, V, N} in ForwardDiff at C:\Users\accou\.julia\packages\ForwardDiff\QdStj\src\dual.jl:427 invalidated:
   backedges: 1: superseding promote_rule(::Type, ::Type) in Base at promotion.jl:310 with MethodInstance for promote_rule(::Type{Int64}, ::Type{S} where S<:Real) (5 children)
              2: superseding promote_rule(::Type, ::Type) in Base at promotion.jl:310 with MethodInstance for promote_rule(::Type{UInt8}, ::Type) (8 children)
              3: superseding promote_rule(::Type, ::Type) in Base at promotion.jl:310 with MethodInstance for promote_rule(::Type{UInt16}, ::Type) (11 children)
              4: superseding promote_rule(::Type, ::Type) in Base at promotion.jl:310 with MethodInstance for promote_rule(::Type{Int64}, ::Type) (285 children)
   19 mt_cache
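
For anyone reproducing this, the usual next step in the SnoopCompile workflow is to walk up the worst tree and see which callers got invalidated. A minimal sketch, assuming the trees variable from the script above and that Cthulhu is installed (this is not part of the original report):

using Cthulhu                    # SnoopCompile extends Cthulhu's ascend for invalidation trees

methinvs = trees[end]            # the most-invalidating method (trees is sorted ascending)
root = methinvs.backedges[end]   # the backedge with the most children (285 above)
ascend(root)                     # interactively browse the chain of invalidated callers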

With the soon-to-be-released SnoopCompile 2.9.6 and a modified script:

using SnoopCompileCore
invs = @snoopr begin
    using OrdinaryDiffEq
    function lorenz(du, u, p, t)
        du[1] = 10.0(u[2] - u[1])
        du[2] = u[1] * (28.0 - u[3]) - u[2]
        du[3] = u[1] * u[2] - (8 / 3) * u[3]
    end
end
tinf = @snoopi_deep begin
    u0 = [1.0; 0.0; 0.0]
    tspan = (0.0, 100.0)
    prob = ODEProblem{true,false}(lorenz, u0, tspan)
    alg = Rodas5()
    solve(prob, alg)
end
using SnoopCompile
trees = invalidation_trees(invs)
staletrees = precompile_blockers(trees, tinf)

There appear to be no invalidations that affect your workload (staletrees is empty). Moreover, the promote_rule invalidation is pretty far back in the list and has just a few children:

inserting promote_rule(::Type{R}, ::Type{ForwardDiff.Dual{T, V, N}}) where {R<:Real, T, V, N} @ ForwardDiff ~/.julia/packages/ForwardDiff/QdStj/src/dual.jl:427 invalidated:
   backedges: 1: superseding promote_rule(::Type, ::Type) @ Base promotion.jl:319 with MethodInstance for promote_rule(::Type{Int128}, ::Type) (1 children)
              2: superseding promote_rule(::Type, ::Type) @ Base promotion.jl:319 with MethodInstance for promote_rule(::Type{Int64}, ::Type{S} where S<:Real) (1 children)
              3: superseding promote_rule(::Type, ::Type) @ Base promotion.jl:319 with MethodInstance for promote_rule(::Type{Int128}, ::Type) (3 children)
              4: superseding promote_rule(::Type, ::Type) @ Base promotion.jl:319 with MethodInstance for promote_rule(::Type{Int64}, ::Type) (6 children)
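
As a sanity check, the same conclusions can be read off programmatically. A minimal sketch, assuming the trees and staletrees variables from the script above (not part of the original report):

using SnoopCompile

isempty(staletrees)                             # true: nothing blocks precompiled code used by this workload
nchildren = SnoopCompile.countchildren.(trees)  # children per invalidated method; trees is sorted ascending
show(trees[end])                                # the worst offender overall, for comparison with the entry above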

Nevertheless, there is new compilation. ProfileView.view(flamegraph(tinf)) shows that the bases of the two expensive inference runs are:

/home/tim/.julia/packages/DiffEqBase/9H9Sj/src/solve.jl:813, MethodInstance for CommonSolve.solve(::ODEProblem{Vector{Float64}, Tuple{Float64, Float64}, true, SciMLBase.NullParameters, ODEFunction{true, SciMLBase.FunctionWrapperSpecialize, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, ::Rodas5{0, true, Nothing, typeof(OrdinaryDiffEq.DEFAULT_PRECS), Val{:forward}, true, nothing})
/home/tim/.julia/packages/DiffEqBase/9H9Sj/src/solve.jl:464, MethodInstance for DiffEqBase.solve_call(::ODEProblem{Vector{Float64}, Tuple{Float64, Float64}, true, SciMLBase.NullParameters, ODEFunction{true, SciMLBase.FunctionWrapperSpecialize, FunctionWrappersWrappers.FunctionWrappersWrapper{Tuple{FunctionWrappers.FunctionWrapper{Nothing, Tuple{Vector{Float64}, Vector{Float64}, SciMLBase.NullParameters, Float64}}, FunctionWrappers.FunctionWrapper{Nothing, Tuple{Vector{ForwardDiff.Dual{ForwardDiff.Tag{DiffEqBase.OrdinaryDiffEqTag, Float64}, Float64, 1}}, Vector{ForwardDiff.Dual{ForwardDiff.Tag{DiffEqBase.OrdinaryDiffEqTag, Float64}, Float64, 1}}, SciMLBase.NullParameters, Float64}}, FunctionWrappers.FunctionWrapper{Nothing, Tuple{Vector{ForwardDiff.Dual{ForwardDiff.Tag{DiffEqBase.OrdinaryDiffEqTag, Float64}, Float64, 1}}, Vector{Float64}, SciMLBase.NullParameters, ForwardDiff.Dual{ForwardDiff.Tag{DiffEqBase.OrdinaryDiffEqTag, Float64}, Float64, 1}}}, FunctionWrappers.FunctionWrapper{Nothing, Tuple{Vector{ForwardDiff.Dual{ForwardDiff.Tag{DiffEqBase.OrdinaryDiffEqTag, Float64}, Float64, 1}}, Vector{ForwardDiff.Dual{ForwardDiff.Tag{DiffEqBase.OrdinaryDiffEqTag, Float64}, Float64, 1}}, SciMLBase.NullParameters, ForwardDiff.Dual{ForwardDiff.Tag{DiffEqBase.OrdinaryDiffEqTag, Float64}, Float64, 1}}}}, false}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, ::Rodas5)

The first is specialization on lorenz, which makes sense. (That's also the more expensive of the two, inference-wise; I haven't checked codegen.) The second doesn't appear to contain lorenz, so I'm not sure why it's being compiled here. Maybe there isn't an exact equivalent inside the @precompile_all_calls block?
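
For reference, here is a hypothetical sketch of what an equivalent workload inside a SnoopPrecompile block could look like. The problem setup just mirrors the script in this issue; it is not the actual precompile workload used by OrdinaryDiffEq:

using OrdinaryDiffEq, SnoopPrecompile

@precompile_setup begin
    # Toy in-place ODE mirroring the lorenz example above.
    function lorenz!(du, u, p, t)
        du[1] = 10.0(u[2] - u[1])
        du[2] = u[1] * (28.0 - u[3]) - u[2]
        du[3] = u[1] * u[2] - (8 / 3) * u[3]
    end
    u0 = [1.0, 0.0, 0.0]
    tspan = (0.0, 100.0)
    @precompile_all_calls begin
        # Hypothetical workload: exercise the same solve path that
        # @snoopi_deep showed being inferred at runtime.
        prob = ODEProblem{true}(lorenz!, u0, tspan)
        solve(prob, Rodas5())
    end
end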

In any case, I suspect this can be closed; either that, or it needs tracking down why there are a lot of children in some cases but very few with this script.

I'm seeing no precompile-blocker invalidations either:

using SnoopCompileCore
invs = @snoopr begin
    using OrdinaryDiffEq
    function lorenz(du, u, p, t)
        du[1] = 10.0(u[2] - u[1])
        du[2] = u[1] * (28.0 - u[3]) - u[2]
        du[3] = u[1] * u[2] - (8 / 3) * u[3]
    end
end
tinf = @snoopi_deep begin
    u0 = [1.0; 0.0; 0.0]
    tspan = (0.0, 100.0)
    prob = ODEProblem{true}(lorenz, u0, tspan)
    alg = Rodas5()
    solve(prob, alg)
end
using SnoopCompile
trees = invalidation_trees(invs)
staletrees = precompile_blockers(trees, tinf)

The part that's compiling is the part that runs before the function-specialization fix is applied. Looks like this is fine, then.