mozack/abra2

IndexOutOfBoundsException at AltContigGenerator.java#L273

hemp opened this issue · 2 comments

hemp commented

When running abra2, I'm hitting this runtime exception:

INFO	Thu Nov 02 15:50:04 UTC 2017	Abra version: 2.11
INFO	Thu Nov 02 15:50:04 UTC 2017	Abra params: [/hemp/abra2-2.11.jar --tmpdir /hemp/javatmpdir --ref /hg38/Homo_sapiens_assembly38.fasta --dist 1000 --in /hemp/NA12878_chr22.bam --threads 4 --gkl --targets chr22.bed --log error --out output-chr22.bam]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
	at java.util.ArrayList.rangeCheck(ArrayList.java:657)
	at java.util.ArrayList.get(ArrayList.java:433)
	at abra.AltContigGenerator.getAltContigs(AltContigGenerator.java:273)
	at abra.ReAligner.processRegion(ReAligner.java:1233)
	at abra.ReAligner.processChromosomeChunk(ReAligner.java:336)
	at abra.ReAlignerRunnable.go(ReAlignerRunnable.java:21)
	at abra.AbraRunnable.run(AbraRunnable.java:20)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

The failing line appears to be:

Indel prev = indel.components.get(0);

From reviewing the code, it looks like there is a path where indelComponents can still be an empty ArrayList after the for-loop over CigarElements, and is then passed to the Indel constructor:

Indel indel = new Indel('C', read.getReferenceName(), indelComponents, firstIdx, SAMRecordUtils.sumBaseQuals(read));
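To illustrate the suspected path, here is a minimal sketch, not the actual ABRA2 code. The `Element` class and the `collectIndelComponents` and `firstComponentOrNull` helpers are hypothetical stand-ins, and the assumption that the loop only collects `I` and `D` elements is mine. It shows how a loop that adds nothing leaves the list empty, so an unguarded `get(0)` throws, and how an `isEmpty()` check avoids that.

```java
import java.util.ArrayList;
import java.util.List;

public class IndelComponentsSketch {

    // Hypothetical stand-in for a CIGAR element: an operator and a length.
    static final class Element {
        final char op;
        final int length;

        Element(char op, int length) {
            this.op = op;
            this.length = length;
        }
    }

    // Assumption: only insertion/deletion elements are collected, so a read
    // without any 'I' or 'D' elements yields an empty list.
    static List<Element> collectIndelComponents(List<Element> cigar) {
        List<Element> components = new ArrayList<>();
        for (Element e : cigar) {
            if (e.op == 'I' || e.op == 'D') {
                components.add(e);
            }
        }
        return components;
    }

    // Guarded access: returns null instead of throwing when the list is empty.
    static Element firstComponentOrNull(List<Element> components) {
        return components.isEmpty() ? null : components.get(0);
    }

    public static void main(String[] args) {
        List<Element> matchOnly = new ArrayList<>();
        matchOnly.add(new Element('M', 100));

        List<Element> components = collectIndelComponents(matchOnly);
        try {
            components.get(0); // unguarded access on an empty list
        } catch (IndexOutOfBoundsException ex) {
            System.out.println("unguarded get(0) failed: " + ex.getClass().getSimpleName());
        }
        System.out.println("guarded result: " + firstComponentOrNull(components));
    }
}
```

Whether skipping such a read or failing with a clearer error message is the right behavior depends on what an empty component list means for the input, so the guard here is only one possible option.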

I apologize up front: I haven't narrowed this down to a smaller reproduction yet. It occurs consistently about 40 minutes into a run.

Can you tell me a bit more about your input data? For example, what aligner are you using? Is there any additional processing done post alignment? Has the input BAM been run through a different ABRA version previously?

hemp commented

I worked with a bioinformatician here who got me sorted. Good call, though: the BAM had already been run through abra1, although its header had been stripped. What got me down this path was that I was profiling G1GC against the Parallel and CMS collectors with some local BAMs to see how things performed in Docker. Thanks for the response.