CliMA/CalibrateEDMF.jl

Reduce ReferenceStatistics memory allocations

ilopezgp opened this issue · 0 comments

Currently, we store the full covariance matrix Γ_full in the ReferenceStatistics struct. Let n be the number of simulations and d the size of the observation vector in each simulation. Since Γ_full is always block diagonal, we store (n*d)^2 floats when at most n*d^2 of its elements can be non-zero.

Solution: Store only the diagonal blocks of Γ_full. This reduces the required storage by a factor of n, so storage grows linearly rather than quadratically with the number of simulations. It also adds no assembly cost to the calibration itself, since the inverse problem only needs Γ, not Γ_full. Assembling Γ_full is therefore only required for offline diagnostics.
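
As a rough illustration of the storage argument, here is a minimal, language-agnostic sketch in pure Python. It is not the package's Julia code: `dense_storage`, `block_storage` and `assemble_full` are hypothetical names chosen for this example, and it assumes every simulation has the same observation size d.

```python
# Illustrative sketch only; none of these names exist in CalibrateEDMF.jl.

def dense_storage(n, d):
    """Floats needed to store the assembled (n*d) x (n*d) matrix."""
    return (n * d) ** 2


def block_storage(n, d):
    """Floats needed to store only the n diagonal blocks, each d x d."""
    return n * d ** 2


def assemble_full(blocks):
    """Assemble the dense block-diagonal matrix from a list of square blocks.

    Only needed on demand (e.g. for offline diagnostics).
    """
    size = sum(len(b) for b in blocks)
    full = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        k = len(b)
        for i in range(k):
            for j in range(k):
                full[offset + i][offset + j] = b[i][j]
        offset += k
    return full


# Three uncorrelated simulations (n = 3), two observations each (d = 2).
blocks = [
    [[2.0, 0.5], [0.5, 1.0]],
    [[3.0, 0.1], [0.1, 4.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
n, d = len(blocks), len(blocks[0])

# Dense storage is n times larger than block storage.
ratio = dense_storage(n, d) // block_storage(n, d)

# The dense matrix is built only when explicitly requested.
full = assemble_full(blocks)
```

With n = 3 and d = 2, the dense layout needs 36 floats while the block layout needs 12. All entries of `full` outside the three 2 x 2 diagonal blocks are zero, which is the redundancy this issue proposes to stop storing.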

Note: By definition, ReferenceModels are uncorrelated. If you have correlated simulations (e.g. stochastic ensemble runs), they should be part of the same ReferenceModel.