broadinstitute/picard

PICARD termination

Soumyadutta-basak opened this issue · 2 comments

java -jar /Bioinfo/picard-2.21.4/picard.jar MarkDuplicates INPUT=Merged_BW-1.bam OUTPUT=Merged_BW-1_mdup.bam METRICS_FILE=BW-1.txt REMOVE_DUPLICATES=false
This is the command I am using to mark duplicates, and this is the error I am getting:

********** NOTE: Picard's command line syntax is changing.
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
********** The command line looks like this in the new syntax:
********** MarkDuplicates -INPUT Merged_BW-1.bam -OUTPUT Merged_BW-1_mdup.bam -METRICS_FILE BW-1.txt -REMOVE_DUPLICATES false

18:22:39.974 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/Bioinfo/picard-2.21.4/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Sat May 27 18:22:39 IST 2023] MarkDuplicates INPUT=[Merged_BW-1.bam] OUTPUT=Merged_BW-1_mdup.bam METRICS_FILE=BW-1.txt REMOVE_DUPLICATES=false MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 TAG_DUPLICATE_SET_MEMBERS=false REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag CLEAR_DT=true DUPLEX_UMI=false ADD_PG_TAG_TO_READS=true ASSUME_SORTED=false DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX=<optimized capture of last three ':' separated fields as numeric values> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 MAX_OPTICAL_DUPLICATE_SET_SIZE=300000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Sat May 27 18:22:39 IST 2023] Executing as sandeeplab1@bioinfo on Linux 5.19.0-32-generic amd64; OpenJDK 64-Bit Server VM 11.0.15-internal+0-adhoc..src; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.21.4-SNAPSHOT
INFO 2023-05-27 18:22:40 MarkDuplicates Start of doWork freeMemory: 149162632; totalMemory: 155189248; maxMemory: 32178700288
INFO 2023-05-27 18:22:40 MarkDuplicates Reading input file and constructing read end information.
INFO 2023-05-27 18:22:40 MarkDuplicates Will retain up to 116589493 data points before spilling to disk.
[Sat May 27 18:22:41 IST 2023] picard.sam.markduplicates.MarkDuplicates done. Elapsed time: 0.02 minutes.
Runtime.totalMemory()=1895825408
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMException: Value was put into PairInfoMap more than once. 1: RGNB552198:15:HNHMHBGXH:4:21508:8996:12908
at htsjdk.samtools.CoordinateSortedPairInfoMap.ensureSequenceLoaded(CoordinateSortedPairInfoMap.java:133)
at htsjdk.samtools.CoordinateSortedPairInfoMap.remove(CoordinateSortedPairInfoMap.java:86)
at picard.sam.markduplicates.util.DiskBasedReadEndsForMarkDuplicatesMap.remove(DiskBasedReadEndsForMarkDuplicatesMap.java:61)
at picard.sam.markduplicates.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:559)
at picard.sam.markduplicates.MarkDuplicates.doWork(MarkDuplicates.java:257)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)

Can anybody help with this?
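One quick way to double-check the diagnosis below is to count how many primary alignments in the merged BAM carry the read name reported in the exception. A minimal sketch, assuming samtools and awk are available on the PATH; the BAM path and read name are taken from the command and stack trace above:

# Count primary (non-secondary, non-supplementary) alignments with the
# read name from the exception; a valid paired-end BAM should print at
# most 2 (one first-in-pair record and one second-in-pair record).
samtools view -F 0x900 Merged_BW-1.bam \
  | awk -F'\t' '$1 == "RGNB552198:15:HNHMHBGXH:4:21508:8996:12908"' \
  | wc -l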

@Soumyadutta-basak It looks like you have two primary reads that are both marked as first-in-pair and share the same read name, which is not valid. Try running ValidateSamFile to test your file.
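A minimal ValidateSamFile sketch, reusing the jar path and BAM from the command above (MODE=SUMMARY prints a tally of each error type rather than listing every offending record):

# Validate the merged BAM with the same Picard jar used for MarkDuplicates.
java -jar /Bioinfo/picard-2.21.4/picard.jar ValidateSamFile INPUT=Merged_BW-1.bam MODE=SUMMARY

If the summary flags paired-read problems, one common cause is the same lane or read group having been fed twice into the merge that produced Merged_BW-1.bam, so re-checking the inputs to the merge step would be a reasonable next step.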

kockan commented

Closing this issue for now. Feel free to reopen if there are any updates.