[Methods] Undesired conversion to a dense matrix
Opened this issue · 3 comments
What happened + What you expected to happen
Current implementation of BottomUpSparse
reconciliation method causes an undesired conversion to a dense matrix, potentially leading to an OOM on large datasets. The lack of speedup was already observed before as noted in makoren's comment in the class definition.
The culprit is the idx_bottom
indexed assignment:
Locally I managed to fix it by replacing this line with the following
for idx in idx_bottom:
P[idx] = S.getrow(idx)
However, the performance is quite suboptimal, but it's sufficient for my use case, hence I would rather raise this issue than create a PR :)
Versions / Dependencies
hierarchicalforecast==0.4.1
Reproduction script
using an S matrix with n>200k leads to OOM with BottomUpSparse reconciliation
Issue Severity
Medium: It is a significant difficulty but I can work around it.
Eventually I managed to achieve a respectable performance through
def _get_PW_matrices(self, S, idx_bottom):
n_hiers, n_bottom = S.shape
P = sparse.lil_matrix(S.shape, dtype=np.float32)
rows = np.array(idx_bottom)
cols = np.arange(n_bottom)
data = np.ones_like(rows, dtype=np.float32)
P = sparse.csr_matrix((data, (rows, cols)), shape=(n_hiers, n_bottom)).T
W = sparse.identity(n_hiers, dtype=np.float32)
return P, W
However I noticed a note about possibly deprecating the idx_bottom
argument in the future, should I open a PR?