SymbolicML/DynamicQuantities.jl

Make SparseArrays a weak dependency?

devmotion opened this issue · 4 comments

In Julia >= 1.10, SparseArrays won't be included in the system image anymore. To reduce loading times, SparseArrays has therefore been demoted (or is about to be) from a regular to a weak dependency in many packages in the ecosystem, such as FillArrays, Distances, and PDMats. However, if any package in the dependency stack loads SparseArrays, all of these extensions will be compiled and loaded. Could the SparseArrays dependency of DynamicQuantities also be moved to an extension or removed altogether, so that it doesn't trigger all of these extensions?

Thanks for raising this, @devmotion. I'm not sure there's a way around this, because SparseArrays is actually used internally for the SymbolicDimensions struct:

struct SymbolicDimensions{R} <: AbstractDimensions{R}
    _data::SA.SparseVector{R}
    SymbolicDimensions(data::SA.SparseVector) = new{eltype(data)}(data)
    SymbolicDimensions{_R}(data::SA.SparseVector) where {_R} = new{_R}(data)
end

It is used here because there are ~100 symbolic physical dimensions (m and cm count as separate dimensions), so rather than storing ~100 fields, one per dimension, I store the exponents in a single sparse vector.
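For a sense of scale, a sparse vector stores only the non-zero exponents. A minimal sketch (the indices below are purely illustrative, not the package's actual mapping):

using SparseArrays

# Suppose :cm maps to index 5 and :s to index 17 in the list of ~100 symbolic
# dimensions. A quantity like cm^2/s then has only two non-zero exponents, so
# the sparse vector stores two index/value pairs instead of ~100 fields.
exponents = sparsevec([5, 17], [2//1, -1//1], 100)
nnz(exponents)  # == 2 stored entries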

Perhaps it is possible to write a small internal sparse vector library to get this working without loading all of SparseArrays.

What do you think?

I wonder if it would be possible to replace _data with a dictionary or, more similarly to the sparse array, with two vectors ind::Vector{Int} (or even sym::Vector{Symbol}?) and val::Vector{R} holding the indices of the non-zero symbolic dimensions and their values.
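Something along these lines (the names are hypothetical, just to illustrate the shape of the data structure):

# Hypothetical sketch of the two-vector alternative; not the package's actual type.
struct SymbolicDimensionsSketch{R}
    nzdims::Vector{Int}  # indices of symbolic dimensions with non-zero exponents
    nzvals::Vector{R}    # the corresponding exponents
end

# e.g. cm^2/s, with :cm at index 5 and :s at index 17 (illustrative indices):
d = SymbolicDimensionsSketch{Rational{Int}}([5, 17], [2//1, -1//1])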

Definitely possible. But needing to unit test a custom sparse array package would add to the maintenance/testing burden, and possibly introduce new bugs (whereas SparseArrays works out of the box).

Maybe my suggestion was not completely clear: I don't think one should implement a new sparse array type, but a different data structure (a dictionary or two vectors) could possibly provide the same functionality without significant regressions. And since SparseVector is itself implemented with two vectors, I think the two-vector alternative would be a natural candidate (https://github.com/JuliaSparse/SparseArrays.jl/blob/034d23420bba5a8688fba8d857fc7f813f121f31/src/sparsevector.jl#L38-L39).
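To illustrate the point, a SparseVector can be pulled apart into exactly such an index vector and value vector and reassembled without loss (a small check, reusing the same illustrative indices as above):

using SparseArrays

# A SparseVector is fully determined by its length, its non-zero indices, and
# its non-zero values, so storing those two vectors directly carries the same
# information without the SparseArrays dependency.
v = sparsevec([5, 17], [2//1, -1//1], 100)
inds, vals = findnz(v)           # ([5, 17], [2//1, -1//1])
v == sparsevec(inds, vals, 100)  # true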