AlexTISYoung/snipar

Memory Allocation Error in impute.py

AnnabelPerry opened this issue · 4 comments

Hello, I am attempting to run impute.py in a conda environment with Python 3.9.16 and pandas 1.1.4, and I am encountering the following error:

2023-06-27 19:22:09,875 INFO impute - main: creating pedigree ...
2023-06-27 19:22:09,981 INFO preprocess_data - create_pedigree: loaded kinship file
2023-06-27 19:22:10,063 INFO preprocess_data - create_pedigree: loaded agesex file
2023-06-27 19:22:10,129 INFO preprocess_data - create_pedigree: creating age and sex dictionaries
2023-06-27 19:22:10,192 INFO preprocess_data - create_pedigree: dictionaries created
2023-06-27 19:22:10,192 INFO preprocess_data - create_pedigree: creating pedigree objects
2023-06-27 19:22:10,261 INFO impute - main: pedigree loaded.
2023-06-27 19:22:10,265 INFO impute - run_imputation: processing /n/groups/reich/anp9168/VCFs/chr1,None
2023-06-27 19:22:10,265 INFO preprocess_data - prepare_data: For file /n/groups/reich/anp9168/VCFs/chr1;None: Finding which chromosomes
2023-06-27 19:22:27,153 INFO preprocess_data - prepare_data: with chromosomes [1] initializing non_gts data
2023-06-27 19:22:27,154 INFO preprocess_data - prepare_data: with chromosomes [1] loading and filtering pedigree file ...
2023-06-27 19:22:27,985 INFO preprocess_data - prepare_data: Adding control to the pedigree ...
2023-06-27 19:22:28,008 INFO preprocess_data - prepare_data: Control Added.
2023-06-27 19:22:28,363 INFO preprocess_data - prepare_data: with chromosomes [1] loading bim file ...
2023-06-27 19:22:28,363 INFO preprocess_data - prepare_data: with chromosomes [1] loading and transforming ibd file ...
2023-06-27 19:22:31,564 INFO preprocess_data - prepare_data: ibd loaded.
2023-06-27 19:22:31,564 INFO preprocess_data - prepare_data: with chromosomes ['1'] initializing non_gts data done ...
2023-06-27 19:22:31,733 INFO preprocess_data - prepare_gts: with chromosomes ['1'] initializing gts data with start=0 end=58745
Traceback (most recent call last):
  File "/home/anp9168/anaconda3/envs/sniparEnv/bin/impute.py", line 432, in <module>
    main(args)
  File "/home/anp9168/anaconda3/envs/sniparEnv/bin/impute.py", line 326, in main
    run_imputation(args)
  File "/home/anp9168/anaconda3/envs/sniparEnv/bin/impute.py", line 208, in run_imputation
    phased_gts, unphased_gts, iid_to_bed_index, pos, freqs, hdf5_output_dict = prepare_gts(phased_address, unphased_address, bim, pedigree_output, ped_ids, chromosomes, start, end, pcs, pc_ids, find_optimal_pc)
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/snipar/imputation/preprocess_data.py", line 713, in prepare_gts
    probs= bgen.read((slice(0, len(bgen.samples)),slice(start, end)))        
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/bgen_reader/_bgen2.py", line 552, in read
    val = np.full(
  File "/home/anp9168/anaconda3/envs/sniparEnv/lib/python3.9/site-packages/numpy/core/numeric.py", line 343, in full
    a = empty(shape, dtype, order)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 853. GiB for an array with shape (487409, 58745, 4) and data type float64
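
For reference, the size in the error message follows directly from the array shape in the traceback: samples × variants × 4 values per variant × 8 bytes per float64. The sketch below only reproduces that arithmetic. `array_gib` is a hypothetical helper, and the 5000-variant window is an illustrative number, not a snipar default or option.

```python
import math

FLOAT64_BYTES = 8  # size of one float64 element


def array_gib(shape, itemsize=FLOAT64_BYTES):
    """Return the memory, in GiB, needed for a dense array of this shape."""
    n_bytes = math.prod(shape) * itemsize
    return n_bytes / 2**30


# Shape reported in the traceback: (samples, variants, values per variant)
full = array_gib((487409, 58745, 4))

# Illustrative only: the same samples, read 5000 variants at a time
window = array_gib((487409, 5000, 4))

print(f"full read: {full:.1f} GiB")
print(f"5000-variant window: {window:.1f} GiB")
```

Because the read in the traceback covers all 487409 samples at once, even a much smaller variant window still needs tens of GiB at this sample count.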

Here is the code I ran:

source activate sniparEnv
unset PYTHONPATH

impute.py -c --ibd IBD_Chr@.ibd --bgen chr@ --out Imputed_Chr@ --king FirstDegreeKING_forImputation.kin0 --agesex FirstDegreeAgeSex_forImputation.txt

Initially, I gave the --ibd flag the IBD_Chr@ prefix without the .ibd suffix, but got the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'IBD_Chr1.segments.gz'

I checked my ibd.py outputs, and they are all named in the format IBD_Chr@.ibd.segments.gz and IBD_Chr@.l2.ldscore.gz, so I added the .ibd suffix to help snipar find the IBD_Chr@.ibd.segments.gz files, but I worry this introduced a new error.
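
To illustrate the file-name lookup described above: judging only from the FileNotFoundError, the '@' in the --ibd value appears to be replaced by the chromosome number and '.segments.gz' appended. The sketch below mimics that behaviour under this assumption. `expand_ibd_path` is a hypothetical helper, not snipar's actual code.

```python
def expand_ibd_path(ibd_prefix, chromosome, suffix=".segments.gz"):
    """Replace the '@' placeholder with the chromosome and append the suffix.

    Assumption: this mirrors the behaviour implied by the error message;
    it is not taken from the snipar source.
    """
    return ibd_prefix.replace("@", str(chromosome)) + suffix


# Prefix without '.ibd': yields the name from the FileNotFoundError
without_ibd = expand_ibd_path("IBD_Chr@", 1)

# Prefix with '.ibd': yields the naming pattern of the ibd.py outputs
with_ibd = expand_ibd_path("IBD_Chr@.ibd", 1)

print(without_ibd)
print(with_ibd)
```

Under that assumption, adding the .ibd suffix only changes which file name is looked up, so it is unlikely to be related to the memory error.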

Thank you for your quick response. I tried running with the --batch_size argument set to 5000 (and also with a single hyphen, as in -batch_size), but in both cases I get impute.py: error: unrecognized arguments: -batch_size 5000

That worked, thanks!