WansonChoi/CookHLA

Cannot get allele results, but no error is reported

Hi @WansonChoi ,

I am running CookHLA on target data (N larger than 50,000) with the 1000G reference data bundled with your software (N=504). Everything ran without an error, but no results were produced. I am wondering whether this happens because my sample is much larger than the example on your GitHub, from which I took the "mem" (2g) and "window" (5) parameters. The imputation log is as follows:

respri.hg19.hla.MHC.QC.exon2.0.5.raw_imputation_out.log

I would be very grateful for a reply!

Thanks,
Guo

Hi @WansonChoi ,

I changed my environment, and now the error is visible.

The following error message always occurs:

ERROR: java.lang.OutOfMemoryError: GC overhead limit exceeded

My command is as follows:

python CookHLA.py
-i ./va_2.hla.bed
-hg 19
-o ./va_2.hg19.hla
-ref 1000G_REF/1000G_REF.EUR.chr6.hg18.29mb-34mb.inT1DGC
-gm ./MyAGM/hla.hg19.mach_step.avg.clpsB
-ae ./MyAGM/hla.hg19.aver.erate
-nth 24
Do you have any suggestions for me?

I would really appreciate a reply!
Guo

Hi @WansonChoi ,

To solve the problem above, I changed -nth to 1, but now the following message appears. Could you give me some advice? Thank you very much!

Command line: java -Xmx1917m -jar beagle.24Aug19.3e8.jar
gt=./va_1_20_60.hg19.hla.MHC.QC.vcf
ref=./va_1_20_60/1000G_REF.EUR.chr6.hg18.29mb-34mb.inT1DGC.exon2.phased.vcf
out=./va_1_20_60/va_1_20_60.hg19.hla.MHC.QC.exon2.0.5.raw_imputation_out
impute=true
gp=true
overlap=0.5
err=0.0033310878243513
map=./vaccine_1_20_60/hla.hg19.mach_step.avg.clpsB.exon2.txt
window=5
ne=10000
nthreads=1

Reference samples: 503
Study samples: 55,712
Window 1 (6:29602876-31520492)

Reference markers: 3,397
Study markers: 1,564

Burnin iteration 1: 13 minutes 4 seconds
Burnin iteration 2: 12 minutes 9 seconds
Burnin iteration 3: 11 minutes 46 seconds
Burnin iteration 4: 11 minutes 28 seconds
Burnin iteration 5: 11 minutes 19 seconds
Burnin iteration 6: 12 minutes 18 seconds

Phasing iteration 1: 11 minutes 50 seconds
Phasing iteration 2: 10 minutes 37 seconds
Phasing iteration 3: 10 minutes 4 seconds
Phasing iteration 4: 9 minutes 48 seconds
Phasing iteration 5: 9 minutes 27 seconds
Phasing iteration 6: 8 minutes 43 seconds
Phasing iteration 7: 7 minutes 56 seconds
Phasing iteration 8: 7 minutes 17 seconds
Phasing iteration 9: 6 minutes 16 seconds
Phasing iteration 10: 19 minutes 51 seconds
Phasing iteration 11: 3 minutes 54 seconds
Phasing iteration 12: 2 minutes 34 seconds
Exception in thread "main" java.lang.NullPointerException
at imp.RefHapHash.i2hap(RefHapHash.java:156)
at imp.RefHapHash.<init>(RefHapHash.java:82)
at imp.ImputedVcfWriter.appendRecords(ImputedVcfWriter.java:110)
at main.WindowWriter.toByteArray(WindowWriter.java:147)
at main.WindowWriter.lambda$printImputed$0(WindowWriter.java:136)
at java.util.stream.IntPipeline$4$1.accept(IntPipeline.java:250)
at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110)
at java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:693)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.Nodes$SizedCollectorTask.compute(Nodes.java:1878)
at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:401)
at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
at java.util.stream.Nodes.collect(Nodes.java:325)
at java.util.stream.ReferencePipeline.evaluateToNode(ReferencePipeline.java:109)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:540)
at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:438)
at main.WindowWriter.printImputed(WindowWriter.java:137)
at main.Main.printOutput(Main.java:193)
at main.Main.phaseData(Main.java:163)
at main.Main.main(Main.java:114)
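
For anyone comparing logs like the one above, here is a minimal sketch that pulls the Java heap limit (the `-Xmx` value) and the sample counts out of a Beagle log, so a small heap next to a large study cohort is easy to spot. The function name `summarize_beagle_log` and the returned keys are hypothetical; only the log lines themselves come from this thread.

```python
import re


def summarize_beagle_log(log_text):
    """Extract the Java heap limit and sample counts from a Beagle log.

    Hypothetical helper for illustration; it only reads text and does
    not call Beagle or CookHLA.
    """
    summary = {}

    # Heap limit as passed to the JVM, e.g. -Xmx1917m or -Xmx8g.
    heap = re.search(r"-Xmx(\d+)([kKmMgG])", log_text)
    if heap:
        size, unit = int(heap.group(1)), heap.group(2).lower()
        factor = {"k": 1 / 1024, "m": 1, "g": 1024}[unit]
        summary["heap_mb"] = size * factor

    # Sample counts are printed with thousands separators (e.g. 55,712).
    for key, label in (("ref_samples", "Reference samples"),
                       ("study_samples", "Study samples")):
        match = re.search(label + r":\s*([\d,]+)", log_text)
        if match:
            summary[key] = int(match.group(1).replace(",", ""))

    # Flag the two failure modes seen in this thread.
    summary["crashed"] = ("Exception in thread" in log_text
                          or "OutOfMemoryError" in log_text)
    return summary


# Lines copied from the log in this comment.
log = """Command line: java -Xmx1917m -jar beagle.24Aug19.3e8.jar
Reference samples: 503
Study samples: 55,712
Exception in thread "main" java.lang.NullPointerException
"""
print(summarize_beagle_log(log))
```

In this log the heap is under 2 GB for 55,712 study samples, which matches the "mem" (2g) setting taken from the example.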

Guo

@xingejun

Hi, xingejun. Thank you for your interest in CookHLA.

It seems you did not allocate enough memory to each imputation. Could you try the last command again with the '-mem' argument?

Because your target data has more than 50k samples, which is quite large, please try running CookHLA serially first, i.e. without multiprocessing (the '-mp' argument).

I think this issue (#14 (comment)) is similar to your case and might be helpful to you.
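
As a rough sketch of the suggestion above, the earlier command can be assembled with an explicit '-mem' value and without '-mp'. The helper `build_cookhla_command` is hypothetical, and `8g` is only an assumed value (the thread mentions `2g` from the example and does not state which value is sufficient for this data). The paths are the ones posted earlier in this issue.

```python
import shlex


def build_cookhla_command(paths, mem="8g"):
    """Assemble a serial CookHLA invocation as an argument list.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    return [
        "python", "CookHLA.py",
        "-i", paths["input"],
        "-hg", paths["hg"],
        "-o", paths["out"],
        "-ref", paths["ref"],
        "-gm", paths["gm"],
        "-ae", paths["ae"],
        # Assumed value; raise it if Java still runs out of memory.
        "-mem", mem,
        # '-mp' is deliberately left out so the imputations run serially.
    ]


cmd = build_cookhla_command({
    "input": "./va_2.hla.bed",
    "hg": "19",
    "out": "./va_2.hg19.hla",
    "ref": "1000G_REF/1000G_REF.EUR.chr6.hg18.29mb-34mb.inT1DGC",
    "gm": "./MyAGM/hla.hg19.mach_step.avg.clpsB",
    "ae": "./MyAGM/hla.hg19.aver.erate",
})

# Print a copy-pasteable shell command (shlex.join needs Python 3.8+).
print(shlex.join(cmd))
```

Building the command as a list keeps each flag next to its value, and the same list could be passed to `subprocess.run` once the memory setting has been chosen.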

Hi @WansonChoi ,

This problem has been solved with the suggestions you gave. Thank you very much.

Xin