broadinstitute/picard

Mark Duplicates - Not Enough Fields Exception

tanyasarkjain opened this issue · 2 comments

I am getting the above error when I run Picard's MarkDuplicates on my BAM file (from NCBI SRA data). I set the read group for my BAM file manually to @RG ID:C30H6AC.1 SM:SRR1505694 PL:ILLUMINA

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. Not enough fields; File SRR1505694.sam; Line 62070660
Line: 3P36HQ1:269:C30H6ACXX:2:1107:12322:7607 147 chr12 95320088 60 100M = 95320047 -141 ATTAAATTAATTAACATTTATGTAAAGTGCCTGGAAGGCAGTGTTTGCTATTATTATCTTCCTTACTGTATCGAGTACATT
at htsjdk.samtools.SAMLineParser.reportFatalErrorParsingLine(SAMLineParser.java:446)
at htsjdk.samtools.SAMLineParser.parseLine(SAMLineParser.java:231)
at htsjdk.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:248)
at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:236)
at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:212)
at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:569)
at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:543)
at picard.sam.markduplicates.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:438)
at picard.sam.markduplicates.MarkDuplicates.doWork(MarkDuplicates.java:222)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:205)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:94)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:104)

@tanyasarkjain Looks like this line is missing base qualities. Does your file pass ValidateSamFile? We also recommend setting the read groups with Picard's AddOrReplaceReadGroups rather than editing the file manually, and then retrying.
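
For anyone hitting the same exception: a SAM alignment line needs at least 11 tab-separated mandatory fields (QNAME through QUAL), and the record in the trace above appears to stop after SEQ. Below is a minimal Python sketch, not part of Picard or htsjdk, that scans the lines of a text SAM file and reports records with too few fields. The function name find_short_records is made up for illustration, and the check only counts fields, so it is no substitute for ValidateSamFile.

```python
# Hypothetical helper (not a Picard/htsjdk API): flag SAM alignment lines
# that have fewer than the 11 mandatory tab-separated fields.
MANDATORY_FIELDS = 11  # QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL


def find_short_records(lines, min_fields=MANDATORY_FIELDS):
    """Yield (line_number, field_count) for each alignment line that is too short."""
    for number, line in enumerate(lines, start=1):
        # Header lines start with '@'; a read name (QNAME) may not.
        if line.startswith("@"):
            continue
        record = line.rstrip("\r\n")
        # A blank line carries no fields at all.
        count = len(record.split("\t")) if record else 0
        if count < min_fields:
            yield number, count
```

It could be run over an open file handle, for example by looping over find_short_records(open("SRR1505694.sam")) and printing each pair. A short record at or near the end of the file may mean the SAM was truncated while it was being written or edited.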

kockan commented

Closing this issue for now. Feel free to reopen if there are any updates.