Running out of Java Heap space in one of two cohorts.
sinnweja opened this issue · 4 comments
I have a cohort of 4200 subjects that I ran successfully with 100G of heap space, with each chromosome finishing in under 30 minutes. The ancestry percentages match the expected full-genome ancestry of about 60% EUR, 30% AMR, and 10% mixed from 2 other ancestries. I have a second cohort of 10K subjects with a different ancestry (a mix of EUR+AFR) that runs out of heap space at 400G, even on the smaller chromosomes. I used the gt-samples option to run flare on only 3000 subjects, and it still ran out of 400G of heap space. I then pre-subset the sample VCF file to 1000 samples, so gt-samples is no longer needed, and it still runs out of heap space at 250G. Any idea what is causing the heap usage to blow up on the second cohort?
Turns out my problems were caused by having the ref-panel file specified incorrectly. I was effectively having flare estimate many more than 5 ancestries. It is running fine now on all 10K subjects in cohort 2, with 5 ancestries. I apologize for the confusion!
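For anyone hitting the same symptom, a quick sanity check on the ref-panel file before launching flare may help. The sketch below is not part of flare. It assumes the ref-panel file has two whitespace-delimited fields per line (reference sample ID, then panel name), which is my understanding of the format, and it counts how many distinct panel names appear. The function names and the sample data are hypothetical.

```python
from collections import Counter
import io


def count_panels(lines):
    """Count reference samples per panel name.

    Assumption: each non-empty line has exactly two whitespace-delimited
    fields, a sample ID followed by a panel name.
    """
    counts = Counter()
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 fields, got {len(fields)}"
            )
        _sample_id, panel = fields
        counts[panel] += 1
    return counts


def check_panel_count(lines, expected):
    """Return (ok, counts); ok is True when the number of distinct
    panel names equals the number of ancestries you intend to model."""
    counts = count_panels(lines)
    return len(counts) == expected, counts


# Hypothetical three-sample panel file with two panels.
example = io.StringIO("s1 EUR\ns2 EUR\ns3 AFR\n")
ok, counts = check_panel_count(example, expected=2)
print(ok, dict(counts))  # prints: True {'EUR': 2, 'AFR': 1}
```

To check a real file, pass an open file handle instead of the `io.StringIO` example, for instance `check_panel_count(open("ref.panel"), expected=5)`, where `ref.panel` is a placeholder name. If the number of distinct panels is far larger than intended (for example, one panel per sample because the columns are swapped), that would match the situation I ran into.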