Is it possible with ~43k samples?

Question

Closed this issue 2 years ago · 3 comments

I set up CookHLA for our study of ~43k samples. BEAGLE failed even though I reserved 250 GB of RAM. Is a run of this size feasible? With only 2,491 samples it worked.

The screen output for the ~43k-sample run is as follows:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/rds/user/jhz22/hpc-work/CookHLA/src/HLA_Imputation_BEAGLE5.py", line 555, in IMPUTE
subprocess.run(re.split('\s+', command), check=True, stdout=f_log, stderr=f_log)
File "/usr/local/software/master/python/3.7/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['java', '-Djava.io.tmpdir=/home/jhz22/Caprion/analysis/work/hla_CookHLA.javatmpdir', '-Xmx250000m', '-jar', './dep>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/software/master/python/3.7/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/rds/user/jhz22/hpc-work/CookHLA/src/HLA_Imputation_BEAGLE5.py", line 559, in IMPUTE
raise CookHLAImputationError(std_ERROR_MAIN_PROCESS_NAME + "Imputation({} / overlap:{}) failed.\n".format(_exonN, _overlap))
src.CookHLAError.CookHLAImputationError:
[HLA_Imputation_BEAGLE5.py::ERROR]: Imputation(exon3 / overlap:1.5) failed.

"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "CookHLA.py", line 1035, in <module>
f_save_IMPUTATION_INPUT=args.save_IMPUTATION_INPUT)
File "CookHLA.py", line 862, in CookHLA
f_measureAcc_v2=f_measureAcc_v2)
File "/rds/user/jhz22/hpc-work/CookHLA/src/HLA_Imputation_BEAGLE5.py", line 179, in __init__
self.dict_IMP_Result[_exonN][_overlap] = dict_Pool[_exonN][_overlap].get()
File "/usr/local/software/master/python/3.7/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
src.CookHLAError.CookHLAImputationError:
[HLA_Imputation_BEAGLE5.py::ERROR]: Imputation(exon3 / overlap:1.5) failed.

Answer 1 · 2022-05-20T02:18:30.000Z

@jinghuazhao

Hi, thank you for using CookHLA.

To look into this more closely, we need the 9 BEAGLE log files from the 9 imputations. If any of them contains something like "Java Heap Memory size error", then your suspicion is right.
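The log check described above could be scripted along these lines. This is only a sketch: the directory layout and the `*.log` pattern are assumptions (CookHLA's actual log file names are not shown in this thread), and the search strings are the standard JVM messages `OutOfMemoryError` and `Java heap space` rather than anything specific to BEAGLE.

```python
import sys
from pathlib import Path

# Standard JVM out-of-memory messages, lower-cased for case-insensitive matching.
OOM_MARKERS = ("outofmemoryerror", "java heap space")


def find_oom_logs(log_dir, pattern="*.log"):
    """Return sorted paths of log files under log_dir that mention a JVM heap error.

    `pattern` is a hypothetical file-name pattern; adjust it to the real log names.
    """
    hits = []
    for path in sorted(Path(log_dir).rglob(pattern)):
        # errors="replace" keeps the scan going if a log contains odd bytes.
        text = path.read_text(errors="replace").lower()
        if any(marker in text for marker in OOM_MARKERS):
            hits.append(path)
    return hits


if __name__ == "__main__":
    # Usage: python find_oom_logs.py <output_directory>
    for log_path in find_oom_logs(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(log_path)
```

If the script prints nothing, none of the scanned logs mention a heap error, and the failure probably has a different cause.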

If the logs do show a heap error, I suggest re-running the CookHLA imputation serially, i.e. without multiprocessing: drop the '-mp' argument and use the '-mem' argument to give a single BEAGLE imputation the maximum memory your system can allocate.
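A rough back-of-the-envelope check shows why running serially can help. The traceback shows a pool worker launching `java` with `-Xmx250000m`, so the sketch below assumes every parallel worker is given that same heap limit; the worker counts and the reading of "250GB" as 250 GiB are illustrative assumptions, not values taken from this thread.

```python
def total_heap_mib(xmx_mib, n_workers):
    """Worst-case combined JVM heap if each worker may grow to xmx_mib."""
    return xmx_mib * n_workers


def fits_in_reservation(xmx_mib, n_workers, reserved_mib):
    """True if the combined heap limits stay within the reserved RAM.

    Heap only: each JVM also needs some memory outside the heap.
    """
    return total_heap_mib(xmx_mib, n_workers) <= reserved_mib


if __name__ == "__main__":
    xmx_mib = 250000           # from '-Xmx250000m' in the traceback
    reserved_mib = 250 * 1024  # assumed: 250 GiB reservation, in MiB
    for n_workers in (1, 2, 9):  # hypothetical '-mp' values
        ok = fits_in_reservation(xmx_mib, n_workers, reserved_mib)
        print(n_workers, total_heap_mib(xmx_mib, n_workers), ok)
```

Under these assumptions only the single-worker case stays within the reservation, which is consistent with the advice to drop '-mp'.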

Answer 2 · 2022-05-21T10:03:27.000Z

Indeed, your suggestion seems to have worked! I dropped -mp and kept -mem 250g as it was (not sure that much is needed now). I also had HIBAG results earlier on, so what remains unsettled is SNP2HLA, which uses BEAGLE 3.04 (without an explicit -nthreads option as in 5.1). Thank you again.

Answer 3 · 2022-05-21T13:13:56.000Z

All worked!