Check optimality of computeResidual/computeJacobian
friedmud opened this issue · 8 comments
Try overloading computeResidual / computeJacobian directly and check timings.
Need to understand if computeQpResidual/Jacobian is slowing things down tremendously.
Also check if Jacobian loop should be reordered for column major access...
Overriding computeResidual!() and computeJacobian!() directly gives a speedup of 30-40% over overriding computeQpResidual() and computeQpJacobian() for the "manual" case.
However - it doesn't seem to affect AD... so maybe the speedup is all in the Jacobian...
I can confirm... overriding computeJacobian!() directly is the one that makes a big difference.
Now let's look at loop order in computeJacobian!()
Using @inline actually makes it even slower...
Let's see if it's the if... try putting it in the loops
It's not the if... putting it inside the loops doesn't matter.
BTW: doing the Jacobian loop "column major" doesn't help.
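For reference, a minimal sketch of what "column major" loop order means for a dense Julia matrix: the row index runs in the inner loop, so consecutive writes touch adjacent memory. The helper name fill_column_major! and the callback f are hypothetical and are not part of the code discussed in this issue.

```julia
# Hypothetical helper, for illustration only.
# Julia stores matrices column by column, so K[i, j] and K[i+1, j] are adjacent.
function fill_column_major!(K::Matrix{Float64}, f)
    n_rows, n_cols = size(K)
    for j in 1:n_cols        # outer loop over columns
        for i in 1:n_rows    # inner loop over rows (contiguous in memory)
            K[i, j] += f(i, j)
        end
    end
    return K
end

# Small usage example with a made-up entry function.
K = fill_column_major!(zeros(3, 2), (i, j) -> 10.0 * i + j)
```

For the small per-element matrices involved here, the whole matrix likely fits in cache either way, which would be consistent with the loop order making no measurable difference.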
Specifying the return type of computeQpJacobian() did help some.
Ok... so using @inline together with specifying the return type means that just overriding computeQpJacobian() is as fast as overriding computeJacobian!()
Here is an example of the form that should be used:

    @inline function computeQpJacobian(kernel::Diffusion, v::Variable, qp::Int64, i::Int64, j::Int64)::Float64
        u = kernel.u
        # Only the diagonal block (same variable) contributes
        if u.id == v.id
            return v.grad_phi[qp][j] ⋅ u.grad_phi[qp][i]
        end
        # Float64 literal so both branches return the same type
        return 0.0
    end
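To try this form in isolation, here is a self-contained sketch. The Variable and Diffusion structs below are hypothetical stand-ins that carry only the fields the function touches (id, grad_phi, u). The real types in the code base will differ, and the gradient values are made up.

```julia
using LinearAlgebra  # provides the ⋅ (dot) operator

# Hypothetical stand-in types, for illustration only.
struct Variable
    id::Int64
    grad_phi::Vector{Vector{Vector{Float64}}}  # grad_phi[qp][i] is a gradient vector
end

struct Diffusion
    u::Variable
end

@inline function computeQpJacobian(kernel::Diffusion, v::Variable, qp::Int64, i::Int64, j::Int64)::Float64
    u = kernel.u
    if u.id == v.id
        return v.grad_phi[qp][j] ⋅ u.grad_phi[qp][i]
    end
    return 0.0
end

# One quadrature point, three shape functions with made-up 2D gradients.
grads = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]]
u = Variable(1, grads)
w = Variable(2, grads)  # a different variable, so its Jacobian block is zero
kernel = Diffusion(u)

diag    = computeQpJacobian(kernel, u, 1, 3, 3)  # [1, 1] ⋅ [1, 1]
offdiag = computeQpJacobian(kernel, u, 1, 1, 2)  # [0, 1] ⋅ [1, 0]
coupled = computeQpJacobian(kernel, w, 1, 1, 1)  # ids differ
```

With the declared ::Float64 return type, both branches produce a Float64, so callers that accumulate the result in a loop see a single concrete type.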