jlchan/FluxDiffUtils.jl

Large number of allocations for sparse operators

Closed this issue · 3 comments

Passing in sparse matrices to hadamard_sum_ATr! seems slower than passing in dense matrices.
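
For context, the slowdown is consistent with scalar indexing into a SparseMatrixCSC: a generic kernel that reads ATr[j,i] entry by entry pays a binary search per lookup on sparse inputs. A minimal sketch of the two approaches (not the package's implementation; the kernel shape out[i] = sum_j ATr[j,i] * f(u[i], u[j]) and the flux f are assumptions here):

using SparseArrays

# Generic Hadamard-sum kernel: out[i] = sum_j ATr[j, i] * f(u[i], u[j]).
# Scalar indexing works for dense and sparse ATr alike, but each
# ATr[j, i] lookup on a SparseMatrixCSC is a binary search through the
# stored entries of column i, so the sparse case can come out slower
# than dense even though it touches fewer nonzeros.
function hadamard_sum_naive!(out, ATr, f, u)
    fill!(out, zero(eltype(out)))
    n = size(ATr, 2)
    for i in 1:n, j in 1:n
        out[i] += ATr[j, i] * f(u[i], u[j])
    end
    return out
end

# Sparse-aware variant: walk only the stored entries of each column via
# rowvals/nonzeros/nzrange, with no per-entry searches.
function hadamard_sum_csc!(out, ATr::SparseMatrixCSC, f, u)
    fill!(out, zero(eltype(out)))
    rows, vals = rowvals(ATr), nonzeros(ATr)
    for i in 1:size(ATr, 2), k in nzrange(ATr, i)
        out[i] += vals[k] * f(u[i], u[rows[k]])
    end
    return out
end

With the structure-aware inner loop, a sparse operator should beat its densified copy whenever it is appreciably sparse.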

Fixed for both sparse and dense matrices in hadamard_sum, and for sparse matrices in hadamard_jacobian, as of 0d6f458.

hadamard_jacobian is still showing a type instability for dense matrices. Profiling flags dFijQ = sum(bmult.(getindex.(dFij,m,n),A_ij_list)) as the culprit...
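
For what it's worth, that pattern goes unstable whenever the container being broadcast over has an abstract element type. A toy reproduction (bmult, dFij, and A_ij_list are stand-ins with guessed types, not the package internals):

using SparseArrays

bmult(x, A) = x * A    # scale an operator block by a scalar entry
m, n = 1, 2
A_ij_list = (sprand(4, 4, 0.5), sprand(4, 4, 0.5))

# Concrete, homogeneous container: everything infers and the sum is cheap.
dFij = (rand(2, 2), rand(2, 2))
sum(bmult.(getindex.(dFij, m, n), A_ij_list))

# Abstract eltype (e.g. from an unspecialized enclosing method): each
# getindex result is boxed, bmult and sum hit dynamic dispatch, and the
# allocations multiply inside the assembly loops.
dFij_any = Any[rand(2, 2), rand(2, 2)]
sum(bmult.(getindex.(dFij_any, m, n), A_ij_list))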

Some weird behaviors:

julia> @btime hadamard_jacobian!($Adense,$Qxy,:skew,$dF,$U);
  1.540 ms (2 allocations: 416 bytes)

julia> @btime hadamard_jacobian!($Asparse,$Qxy,:skew,$dF,$U);
  3.782 ms (2 allocations: 416 bytes)

julia> @btime hadamard_jacobian!($Adense,$QxyDense,:skew,$dF,$U);
  712.596 ms (9871876 allocations: 1.25 GiB)

Jacobian assembly is 2.5x slower for sparse matrices (probably due to #8).

For dense matrices, there's a type instability in hadamard_jacobian.
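
A quick way to confirm the dense/dense blow-up is an inference failure rather than an algorithmic cost is to inspect the flagged call directly, e.g. reusing the benchmark's variables:

using InteractiveUtils   # for @code_warntype

# Abstract (red) types in the output, especially around the dFijQ line,
# confirm that inference loses track of types for dense operators.
@code_warntype hadamard_jacobian!(Adense, QxyDense, :skew, dF, U)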

The type instability should be fixed as of eda109a (the cause was a missing specialization on dF in hadamard_jacobian! for dense matrices).
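
Plausibly this is Julia's specialization heuristic at work: a method that only passes a Function-typed argument through to another call is compiled without specializing on it, unless a type parameter forces the specialization. A schematic illustration (hypothetical signatures, not the package's actual code):

# Stand-in for the inner kernel that dF eventually gets handed to.
apply_entry(f, x) = f(x)

# dF is only passed through, never called here, so Julia's heuristic
# compiles this method without specializing on typeof(dF); the return
# type of apply_entry is then unknown and each assignment boxes.
function assemble_unspecialized!(A, dF, U)
    for i in eachindex(U)
        A[i] = apply_entry(dF, U[i])
    end
    return A
end

# The ::F type parameter forces specialization and restores inference.
function assemble_specialized!(A, dF::F, U) where {F}
    for i in eachindex(U)
        A[i] = apply_entry(dF, U[i])
    end
    return A
end

This matches the fix description above: annotating dF with a type parameter is the standard remedy for Julia's avoid-specializing-on-Function heuristic.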