morrowcj/remotePARTS

improve memory of parallel fitGLS_partition


Problem

The parallel partitioned GLS is driven by the function MC_GLSpart(). This function uses foreach(i = 1:npart, ...) %dopar% {...} syntax. With this formulation, the entire dataset is imported on each instance (thread), so memory usage snowballs quickly: roughly ncores $\times$ the size of the data object.

Solution

foreach() accepts an iterator, which allows data to be constructed on the fly. In short, each instance could import only the data for the partition it is working on, rather than the full dataset. The upshot is that total memory usage shouldn't be much greater than the size of the original object. So, we should replace i = 1:npart with an iterator that provides the partitions.
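A minimal sketch of what such an iterator could look like, following the custom-iterator pattern from the iterators package. The helper ipartition(), the toy data, and the index matrix partmat (one column of row indices per partition) are all hypothetical and not part of remotePARTS; lm() stands in for the per-partition GLS fit. It runs sequentially with %do% so that it works without a registered backend.

```r
library(foreach)
library(iterators)

# Hypothetical helper (not part of remotePARTS): each call to nextElem()
# returns only the rows belonging to the next partition.
# Assumes `partmat` holds row indices, one column per partition.
ipartition <- function(data, partmat) {
  i <- 0L
  nextEl <- function() {
    # foreach stops iterating when it sees this specific error message
    if (i >= ncol(partmat)) stop("StopIteration", call. = FALSE)
    i <<- i + 1L
    data[partmat[, i], , drop = FALSE]
  }
  structure(list(nextElem = nextEl),
            class = c("ipartition", "abstractiter", "iter"))
}

# Toy data: 12 rows split into 3 partitions of 4 rows each
set.seed(1)
dat <- data.frame(y = rnorm(12), x = rnorm(12))
partmat <- matrix(sample(12), ncol = 3)

# The loop body only ever sees `part`, never the full `dat`
res <- foreach(part = ipartition(dat, partmat), .combine = rbind) %do% {
  coef(lm(y ~ x, data = part))  # stand-in for the per-partition fit
}
```

With a parallel backend registered, %do% would become %dopar%. The memory saving presumably depends on the loop body not referencing the full data object by name, since foreach exports variables used in the body to the workers.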

The recommended solution to #13 may also be useful here.