Memory usage can be excessive
I've been trying this on our HPC cluster and I constantly run out of memory. Data sets:
- 16 MeCap samples with 18-50M reads each (depending on the sample; anything above 50M has been downsampled to 50M)
- 16 Input files with 5M reads each (downsampled because they were higher)
I can load no more than about 4-5 samples, and at the point of loading the input files, memory usage goes through the roof (I've measured spikes of 220G) and I exceed the limits set on my user account, which means the job is killed.
@SPPearce, did you ever see such high memory usage?
Alternatively I could batch analyses, but I have no idea if qsea objects can be combined.
Eww, that is an excessively high amount of memory.
How many cores are you trying to use? It will try to load samples in parallel based on setMesaParallel. If you lower that, does the same code work?
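For reference, a minimal sketch of lowering the worker count before loading; the single numeric argument to setMesaParallel is an assumption on my part, so check its help page for the actual interface:

```r
# Assumed: the 'mesa' package provides setMesaParallel(), and it takes the
# number of workers as a single numeric argument -- see ?setMesaParallel.
library(mesa)
setMesaParallel(2)  # limit loading to 2 workers, then re-run the same code
```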
You can combine qseaSets yes, using combineQsets (or combineQsetsList). So you can parallelise by making individual qseaSets and then merging them afterwards. The one thing to be aware of at that point is normalisation. The qsea method is to use TMM normalisation, but I don't believe that is necessary (see one of the help files for my reasoning; one reason being this merging), so I don't recommend doing it. The mesa functions therefore don't do TMM normalisation.
Even if you want to do TMM normalisation, you can perform that afterwards once you have all the samples together in one qseaSet object.
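For illustration, a rough sketch of that split-then-merge pattern; buildSampleQset below is a hypothetical per-sample constructor standing in for whatever set-up call you currently run, and I'm assuming combineQsetsList accepts a plain list of qseaSet objects (check its help page for the exact arguments):

```r
library(parallel)
# library(mesa)  # assumed to provide combineQsetsList()

# One row of the sample sheet per sample; buildSampleQset is a hypothetical
# helper that runs the usual single-sample set-up steps and returns a qseaSet.
sample_sheet <- read.csv("samples.csv")
rows <- split(sample_sheet, seq_len(nrow(sample_sheet)))

# Build each qseaSet in its own worker to cap per-process memory,
# then merge the results into a single qseaSet.
qset_list <- mclapply(rows, buildSampleQset, mc.cores = 4)
merged    <- combineQsetsList(qset_list)
```

Any TMM normalisation could then be applied to the merged object as a separate step.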
If I use qseaInput as the CNVmethod I don't see this spike occurring (I guess it happens during HMMcopy; unfortunately memory profiling in R is pretty poor), though I don't know how it differs from the regular HMMcopy invocation. Unfortunately it breaks elsewhere, so I'm back to square one.
Parallelising doesn't seem to change the outcome: each parallel process consumes 10G at most, but once the workers have finished their jobs I see memory fluctuating between 90 and 100G before shooting up to 240G (so far; I'm getting close to the hard limits I have on that HPC platform).
Batching won't help either, unfortunately, as there's another problem: dplyr complains about too many rows when combining.
Can you share the command you are giving?
I wonder if the massive duplication we found in #26 is also responsible for the large memory usage.
Yes, it is related. Fixing the issue there massively reduces the memory usage.
Great, so we just need to catch the underlying error then.